How can I change the data in TDengine? - database

I'm using a TDengine database for collected sensor data. Sometimes the collected data contains errors, and the erroneous values are stored in the database as well. The data is inserted by timestamp.
How can I change these values once I have found an error?

You cannot if the database's UPDATE option is 0, but if it is 1 or 2 you can update the data by using
insert into dbname.tableName values(timestamp_value, col1_value, ...)
Just keep in mind that timestamp_value must be the same as the timestamp of the row you want to update.

Related

Does original database change in knime

When using the database manipulation nodes, does the original database data change or remain the same?
Can anyone answer please?
If you write to or update the database, the original database will be updated. Read or query operations such as row / column filtering, row sampling, groupby and joiner do not change the underlying database.

Master Data Services

I'm a SQL Server developer learning MDS. I loaded some entities via staging tables and via Excel add-in.
I'm trying to update members in an entity in MDS via the staging table. I can successfully add new members, but any attribute updates to existing members aren't populated to the entity view. The import process runs successfully with no errors.
I've tried ImportType = 0 and 2, neither works. When I set to 1, as expected I get an error. I also tried to update the code value using the NewCode column and that also does not get updated.
I've set up staging data with an SSIS package, and also with direct T-SQL INSERT INTO statement.
I am using almost the same T-SQL INSERT statement for a test entity which I created to load a new member, and then to modify attributes for the new member in a second batch.
Do you have any ideas why the updates would be ignored, or suggestions for things I can try?
Look at your batch in the staging table to see if any errors occurred. If "ImportStatus_ID" = 2, then the record failed to import. You can see the reason for the failure by querying the view that shows the reasons for import failures. The view will be named "stg.viw_EntityName_MemberErrorDetails".
Here is a Microsoft link for reference:
https://technet.microsoft.com/en-us/library/ff486990(v=sql.110).aspx
Hope this helps.
As suggested above, the member error details view describes the error.
Make sure that you check the points below when updating in MDS:
1) Put the Code column in your INSERT statement
2) Include all columns of the staging table in the INSERT query when using ImportType = 2 (otherwise the omitted columns will be updated to NULL)
You should insert the data into the staging table with ImportType 0 or 2 along with the BatchTag, and then run the staging stored procedure to load the data from the staging table into the entity table. The stored procedure compares the data in the staging table with the data in the entity table based on the Code value and updates the data in the entity table.
You can also reset ImportStatus_ID in the leaf staging table (stg.C_Leaf in this example):
update stg.C_Leaf
set ImportStatus_ID = 0
I think this will mark the rows as ready for staging, so that they are loaded into the MDM entity.
Using ImportType = 0 should update the attributes as long as the staged attribute values are not NULL. If a staged value is NULL, that update will not be applied. Recheck the data in the entity.
If that doesn't work, please try to refresh the cache in the model and then fetch the entity details again.
Learn more about import types in MDS from below link:
https://learn.microsoft.com/en-us/sql/master-data-services/leaf-member-staging-table-master-data-services?view=sql-server-2017
Hope this helps.

How to transfer only new records between two different databases (ie. Oracle and MSSQL) using SSIS?

Do you know how to transfer only new records between two different databases (i.e. Oracle and MSSQL) using SSIS? There is no problem transferring only new data between two tables in the same database and server, but is it possible to do this between completely different servers and databases?
PS. I know about the solution using Lookup, but it is not very efficient if you need to check and add a lot of records (50k and more) several times per day. I would like to operate on new data only.
You have several options:
Timestamp based solution
If you have a column which stores the insertion time in the source system, you can select only the new records created since the last load. With the same logic, you can transfer modified records too: just mark each record with a timestamp value whenever it changes.
Sequence based solution
If there is a sequence in the source table, you can load the new records based on that sequence. Query the last value from the destination system, then load everything which is larger than that value.
CDC based solution
If you have CDC (Change Data Capture) in your source system, you can track the changes and you can load them based on the CDC entries.
Full load
This is the most resource hungry solution: you have to copy all data from the source to the destination. If you do not have any column which marks the new records, you should use this solution.
You have several options to achieve this:
TRUNCATE the destination table and reload it from source
Use a Lookup component to determine which records are missing
Load all data from source to a temporary table and write a query which retrieves the new/changed records.
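The last option (load everything into a temporary table, then query for the new/changed records) can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database standing in for the destination server; the table names `stage` and `target`, the columns `id` and `val`, and the function `load_new_and_changed` are all hypothetical and not part of SSIS.

```python
import sqlite3

def load_new_and_changed(dest, source_rows):
    """Stage a full extract, then merge only rows that are new or changed."""
    cur = dest.cursor()
    # Hypothetical staging table; emptied before every load.
    cur.execute("CREATE TEMP TABLE IF NOT EXISTS stage (id INTEGER PRIMARY KEY, val TEXT)")
    cur.execute("DELETE FROM stage")
    cur.executemany("INSERT INTO stage (id, val) VALUES (?, ?)", source_rows)

    # Rows that are missing from target, or whose value differs
    # (IS NOT is SQLite's NULL-safe inequality).
    cur.execute("""
        SELECT s.id, s.val
        FROM stage AS s
        LEFT JOIN target AS t ON t.id = s.id
        WHERE t.id IS NULL OR t.val IS NOT s.val
        ORDER BY s.id
    """)
    delta = cur.fetchall()
    cur.executemany("INSERT OR REPLACE INTO target (id, val) VALUES (?, ?)", delta)
    dest.commit()
    return delta

dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
dest.executemany("INSERT INTO target VALUES (?, ?)", [(1, "a"), (2, "b")])

# Full extract from the source: row 2 changed, row 3 is new.
delta = load_new_and_changed(dest, [(1, "a"), (2, "B"), (3, "c")])
print(delta)
```

Note that the full extract still has to cross the network; only the comparison and the final write are limited to the delta.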
Summary
If you have at least one column which marks the new/modified records, you can use it to implement a differential/incremental load with SSIS. If you do not have any clue which columns/rows have changed, you have to load (or at least query) all of them.
There is no one-query (INSERT .. SELECT) solution across multiple servers that avoids transferring all of the data. (Please note that a multi-server query using Linked Servers also transfers the data from the source system.)
What about variables? Is it possible to use the same variable between different databases and servers in SSIS?
I would like to transfer last id number from a destination table and transfer it to the source table (different server!).
I can set a variable in a database scope like this:
DECLARE @Last int
SET @Last = (SELECT TOP 1 Id FROM dbo.Table_1 ORDER BY Id DESC)
SELECT *
FROM dbo.Table_2
WHERE ID > @Last;
However, this only works between two tables in the same database (as a single SQL command). I can create a variable for an entire SSIS package in Variables --> Add variable, but I don't know whether it is possible to use that variable in a similar way: to keep the last id from the destination table and pass it to the query against the source server as a lower bound.
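The general idea of reading the last id on one connection and passing it as a query parameter on another can be sketched outside SSIS like this. It is a minimal Python sketch, assuming two separate in-memory SQLite databases as stand-ins for the source and destination servers; the `payload` column and the function `transfer_new_rows` are hypothetical, and the table names are borrowed from the T-SQL above.

```python
import sqlite3

def transfer_new_rows(source, dest):
    """Copy rows whose id is above the destination's current maximum."""
    # 1. Read the high-water mark from the destination connection.
    last_id = dest.execute("SELECT COALESCE(MAX(id), 0) FROM table_1").fetchone()[0]
    # 2. Pass it to the source connection as a bound parameter.
    rows = source.execute(
        "SELECT id, payload FROM table_2 WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    # 3. Insert only those rows into the destination.
    dest.executemany("INSERT INTO table_1 (id, payload) VALUES (?, ?)", rows)
    dest.commit()
    return len(rows)

# Two independent databases stand in for the two servers.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE table_2 (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany("INSERT INTO table_2 VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 6)])

dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE table_1 (id INTEGER PRIMARY KEY, payload TEXT)")
dest.executemany("INSERT INTO table_1 VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 4)])

first = transfer_new_rows(source, dest)   # ids 4 and 5 are new
second = transfer_new_rows(source, dest)  # nothing new on the second run
print(first, second)
```

Only the single integer travels from the destination to the source query, so the source returns just the rows above that value.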

Recommended way of adding daily data to database

I receive new data files every day. Right now, I'm building the database with all the required tables to import the data and perform the required calculations.
Should I just append each new day's data to my current tables? Each file contains a date column, which would allow for a "WHERE" query in the future if I need to analyze data for one particular day. Or should I be creating a new set of tables for every day?
I'm new to database design (coming from Excel). I will be using SQL Server for this.
Assuming that the structure of the data being received is the same, you should only need one set of tables rather than creating new tables each day.
I'd recommend storing the value of the date column from your incoming data in your database, and also having a 'CreateDate' column in your tables, with a default value of 'GetDate()' so that it automatically gets populated with the current date when the row is inserted.
You may also want to have another column to store the data filename that the row was imported from, but if you're already storing the value of the date column and the date that the row was inserted, this shouldn't really be necessary.
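A single table with the file's date column, an automatically populated creation date, and an optional source filename might look something like this. The sketch uses Python with SQLite only so that it runs anywhere; in SQL Server the default would be GetDate() as mentioned above. The table `daily_data`, its columns, and the helper `append_file` are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE daily_data (
        id          INTEGER PRIMARY KEY,
        data_date   TEXT NOT NULL,   -- date column taken from the incoming file
        value       REAL,
        source_file TEXT,            -- optional: which file the row came from
        create_date TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP  -- set on insert
    )
""")

def append_file(conn, filename, rows):
    """Append one day's rows to the single shared table."""
    conn.executemany(
        "INSERT INTO daily_data (data_date, value, source_file) VALUES (?, ?, ?)",
        [(data_date, value, filename) for data_date, value in rows],
    )
    conn.commit()

append_file(conn, "2024-01-01.csv", [("2024-01-01", 1.5), ("2024-01-01", 2.5)])
append_file(conn, "2024-01-02.csv", [("2024-01-02", 4.0)])

# Analysing one particular day is a plain WHERE query.
one_day = conn.execute(
    "SELECT value FROM daily_data WHERE data_date = ? ORDER BY id", ("2024-01-01",)
).fetchall()
print(one_day)
```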
In the past, when doing this type of activity using a custom data loader application, I've also found it useful to create log files to log success/error/warning messages, including some type of unique key of the source data and target database - ie. if coming from an Excel file and going into a database column, you could store the row index from Excel and the primary key of the inserted row. This helps tracking down any problems later on.
You might want to consider having a look at SSIS (SQL Server Integration Services). It's the SQL Server tool for doing ETL activities.
yes, append each day's data to the tables; 1 set of tables for all data.
yes, use a date column to identify the day that the data was loaded.
maybe have another table with a date column and a clob column. The date to contain the load date and the clob to contain the file that you imported.
Good question. You most definitely should have a single set of tables and append the data daily. Consider this: if you create a new set of tables each day, what would, say, a monthly report query look like? A quarterly report query? It would be a mess, with UNIONs and JOINs all over the place.
A single set of tables with a WHERE clause makes the querying and reporting manageable.
You might do a little reading on relational database theory. Wikipedia is a good place to start. The basics are pretty straightforward if you have the knack for it.
I would have the data load into a stage table regardless and append to the main tables afterwards. Once a week I would then refresh all data in the main table to ensure that the data remains correct as per the source.
Marcus

Populate SQL database from textfile on a background thread constantly

Currently, I would like to provide this as an option to the user when storing data to the database:
Save the data to a file and use a background thread to read data from the textfile into SQL Server.
Flow of my program:
- A stream of data comes from a server constantly (100 per second).
- I want to store the data in a textfile and, as another user option, use a background thread to constantly copy the data from the textfile into the SQL database.
Has this been done before?
Cheers.
Your question is indeed a bit confusing.
I'm guessing you mean that:
100 rows per second come from a certain source or server (eg. log entries)
One option for the user is textfile caching: the rows are stored in a textfile and periodically an incremental copy of the contents of the textfile into (an) SQL Server table(s) is performed.
Another option for the user is direct insert: the data is stored directly in the database as it comes in, with no textfile in between.
Am I right?
If yes, then you should do something along the lines of:
Create a trigger on an INSERT action to the table
In that trigger, check which user is inserting. If the user has textfile caching disabled, then the insert can go on. Otherwise, the data is redirected to a textfile (or a caching table)
Create a stored procedure that checks the caching table or text file for new data, copies the new data into the real table, and deletes the cached data.
Create an SQL Server Agent job that runs above stored procedure every minute, hour, day...
Since the interface from T-SQL to textfiles is not very flexible, I would recommend using a caching table instead. Why a textfile?
And for that matter, why cache the data before inserting it into the table? Perhaps we can suggest a better solution, if you explain the context of your question.
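The caching-table variant described above (a procedure that copies new rows from the caching table into the real table and then deletes them) can be sketched like this. It is a minimal illustration in Python with SQLite rather than a T-SQL stored procedure; the tables `cache` and `readings` and the function `flush_cache` are hypothetical, and in practice something like a SQL Server Agent job or a background timer would call it on a schedule.

```python
import sqlite3

def flush_cache(conn):
    """Move all currently cached rows into the real table in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        # Fix the upper bound first, so rows arriving later are left for the next run.
        max_id = conn.execute("SELECT MAX(id) FROM cache").fetchone()[0]
        if max_id is None:
            return 0  # nothing cached
        conn.execute(
            "INSERT INTO readings (ts, value) "
            "SELECT ts, value FROM cache WHERE id <= ? ORDER BY id",
            (max_id,),
        )
        moved = conn.execute("DELETE FROM cache WHERE id <= ?", (max_id,)).rowcount
    return moved

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (id INTEGER PRIMARY KEY AUTOINCREMENT, ts TEXT, value REAL)")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY AUTOINCREMENT, ts TEXT, value REAL)")

# Incoming rows land in the caching table first.
conn.executemany(
    "INSERT INTO cache (ts, value) VALUES (?, ?)",
    [("2024-01-01 00:00:00", 1.0), ("2024-01-01 00:00:01", 2.0), ("2024-01-01 00:00:02", 3.0)],
)
conn.commit()

moved_first = flush_cache(conn)   # moves the three cached rows
moved_second = flush_cache(conn)  # cache is empty now
print(moved_first, moved_second)
```

Doing the copy and the delete inside one transaction means a failure part-way through leaves the cached rows in place instead of losing or duplicating them.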
