Our organization uses Elastic Logstash & Kibana (ELK) and we use a SQL Server data warehouse for analysis and reporting. There are some data items from ELK that we want to copy into the data warehouse. I have found many websites describing how to load SQL Server data into ELK. However, we need to go in the other direction. How can I transfer data from ELK to SQL Server, preferably using SSIS?
I have implemented a similar solution in Python, where we ingest data from an Elastic cluster into our SQL data warehouse. You can use the Elasticsearch package for Python, which allows you to do that.
You can find more information here:
https://elasticsearch-py.readthedocs.io/en/master/
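For illustration, here is a minimal sketch of that approach; the index name, field names, table name, and connection details are placeholders you would adjust to your own cluster and warehouse schema. It pages through an index with the helpers.scan generator and inserts rows into SQL Server via pyodbc:

```python
# Minimal sketch: pull documents from Elasticsearch and insert them into SQL Server.
# The index, fields, table, and connection strings below are placeholders.
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan
import pyodbc

es = Elasticsearch(["http://elk-node:9200"])          # your Elastic cluster
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=dwh-server;DATABASE=DWH;Trusted_Connection=yes;"
)
cursor = conn.cursor()

# scan() pages through every document matching the query, so large indices are fine
for doc in scan(es, index="app-logs-*", query={"query": {"match_all": {}}}):
    src = doc["_source"]
    cursor.execute(
        "INSERT INTO dbo.ElkLogs (doc_id, log_timestamp, message) VALUES (?, ?, ?)",
        doc["_id"], src.get("@timestamp"), src.get("message"),
    )

conn.commit()
conn.close()
```

If you need to stay within SSIS, one option is to invoke a script like this from an Execute Process task, since as far as I know SSIS does not ship an Elasticsearch source out of the box.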
I need to design a scalable database architecture to store all the data coming from flat files (CSV, HTML, etc.). These files come from Elasticsearch, and most of the scripts are written in Python. This data architecture should automate most of the daily manual processing currently done in Excel, CSV, and HTML, and all data will be retrieved from this database instead of being kept in CSV and HTML files.
Database requirements:
The database must perform well for day-to-day data retrieval, and it will be queried by multiple teams.
An ER model and schema will be developed for the data with logical relationships.
The database can be hosted in the cloud.
The database must be highly available and should support fast data retrieval.
This database will be utilized to create multiple dashboards.
The ETL jobs will be responsible for storing data in the database.
There will be many reads from the database and multiple writes each day, with lots of data coming from Elasticsearch and some of the cloud tools.
I am considering RDS, Azure SQL, DynamoDB, Postgres, or Google Cloud. I would like to know which database engine would be a better fit given these requirements. I also want to know how the ETL process should be designed: lambda or kappa architecture.
To store relational data such as CSV and Excel files, you can use a relational database. For flat files like HTML, which don't need to be queried, you can simply use a storage account from any cloud service provider, for example Azure.
Azure SQL Database is a fully managed platform as a service (PaaS) database engine that handles most of the database management functions such as upgrading, patching, backups, and monitoring without user involvement. Azure SQL Database always runs on the latest stable version of the SQL Server database engine and a patched OS with 99.99% availability. You can restore the database to any point in time. This should be the best choice for storing relational data and running SQL queries.
Azure Blob Storage is Microsoft's object storage solution for the cloud. Blob storage is optimized for storing massive amounts of unstructured data. Your HTML files can be stored here.
The ETL jobs can be performed using Azure Data Factory (ADF). It lets you connect to almost any data source (including sources outside Azure), transform the stored dataset, and load it into the desired destination. The Data Flow transformation in ADF can handle all of the ETL-related tasks.
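If you prefer to keep the load scripted in Python (since most of your existing scripts are already in Python) rather than using an ADF data flow, a minimal sketch could look like the following; the server, database, credentials, file names, and container name are all placeholders:

```python
# Minimal scripted alternative to an ADF copy activity, with placeholder names/credentials:
# load a CSV into Azure SQL Database and park the raw HTML files in Blob Storage.
import pandas as pd
from sqlalchemy import create_engine
from azure.storage.blob import BlobServiceClient

# 1. Relational data -> Azure SQL Database
engine = create_engine(
    "mssql+pyodbc://user:password@myserver.database.windows.net/mydb"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)
df = pd.read_csv("daily_extract.csv")
df.to_sql("daily_extract", engine, schema="dbo", if_exists="append", index=False)

# 2. Unstructured HTML -> Blob Storage
blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = blob_service.get_container_client("raw-html")
with open("report.html", "rb") as f:
    container.upload_blob(name="report.html", data=f, overwrite=True)
```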
I am new to Azure and have no prior experience or knowledge of working with Azure data warehouse systems (now Azure Synapse Analytics).
I have access to a "read only" data warehouse (not in Azure) that looks like this:
I want to replicate this data warehouse as it is on the Azure cloud. Can anyone point me in the right direction (video tutorials or documentation) and outline the number of steps involved in this process? There are around 40 databases in this warehouse. And what if I wanted to replicate only specific ones?
You can't do that with only read-only permission. No matter which data warehouse you use, you need server admin or database owner permission to replicate a database.
You can easily confirm this from any of the documents related to database backup/migration/replication, for example: https://learn.microsoft.com/en-us/sql/t-sql/statements/backup-transact-sql?view=sql-server-ver15#permissions
If you have enough permissions, then you can do that. But for Azure SQL Data Warehouse, now called dedicated SQL pool (formerly SQL DW), you can't replicate an on-premises data warehouse to Azure directly.
The official documentation provides a way to import the data into an Azure dedicated SQL pool (formerly SQL DW):
Once your dedicated SQL pool is created, you can import big data with simple PolyBase T-SQL queries, and then use the power of the distributed query engine to run high-performance analytics.
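As a rough illustration of that PolyBase pattern (not the official script), here is a hedged sketch driven from Python via pyodbc. The server, storage account, file layout, column list, and credentials are placeholders, and a database scoped credential would be needed if the container is not public:

```python
# Hedged sketch of a PolyBase load into a dedicated SQL pool, run from Python via pyodbc.
# Server, storage account, paths, columns, and credentials are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myworkspace.sql.azuresynapse.net;DATABASE=mysqlpool;"
    "UID=sqladmin;PWD=<password>",
    autocommit=True,
)
cur = conn.cursor()

# External data source pointing at the exported files in Blob Storage
# (add CREDENTIAL = <database scoped credential> if the container is not public)
cur.execute("""
CREATE EXTERNAL DATA SOURCE StagingBlob
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://staging@mystorageaccount.blob.core.windows.net');
""")
cur.execute("""
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2));
""")
# External table over the files, then CTAS into a distributed internal table
cur.execute("""
CREATE EXTERNAL TABLE dbo.Sales_ext (SaleId INT, Amount DECIMAL(18,2), SaleDate DATE)
WITH (LOCATION = '/sales/', DATA_SOURCE = StagingBlob, FILE_FORMAT = CsvFormat);
""")
cur.execute("""
CREATE TABLE dbo.Sales
WITH (DISTRIBUTION = ROUND_ROBIN)
AS SELECT * FROM dbo.Sales_ext;
""")
conn.close()
```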
You could also use another ETL tool to migrate the data from the on-premises data warehouse to Azure. For example, using Data Factory, combine these two tutorials:
Copy data to and from SQL Server by using Azure Data Factory
Copy and transform data in Azure Synapse Analytics by using Azure Data Factory
I have a SQL Server 2012 hosted on a standalone machine. I want to migrate it to my AWS Redshift (already existing data warehouse).
My question is whether this is possible via AWS Database Migration Service (DMS)?
I am also open to other efficient methods for migration. Currently I am doing the following steps:
taking a backup of the SQL Server DB on the standalone server.
uploading it to AWS S3.
dropping and restoring the DB from S3 into AWS RDS (SQL Server).
I would like this data to be present in my data warehouse, i.e. AWS Redshift.
Thanks for the help in advance !
There are 2 types of migration within DMS:
"one off" data migration, where the data is copied using SQL statements
"continuous replication", where the change data capture (CDC) system on the source is used to capture and process just the updates.
SQL Server can be used as a source for both of these types; however, there are caveats and limitations that should be read and understood thoroughly.
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.SQLServer.html
So long as you follow the instructions and stay within the documented limitations, it will work great.
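As an illustration only, here is a hedged boto3 sketch that creates one task of each type. Every ARN and identifier is a placeholder, and the SQL Server source endpoint, Redshift target endpoint, and replication instance are assumed to have been created already:

```python
# Hedged sketch: create the two DMS task types with boto3.
# All ARNs/identifiers are placeholders; endpoints and the replication instance must exist.
import json
import boto3

dms = boto3.client("dms", region_name="us-east-1")

table_mappings = json.dumps({
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-dbo",
        "object-locator": {"schema-name": "dbo", "table-name": "%"},
        "rule-action": "include",
    }]
})

# "One off" full load of the existing data
dms.create_replication_task(
    ReplicationTaskIdentifier="sqlserver-to-redshift-full-load",
    SourceEndpointArn="arn:aws:dms:<region>:<account>:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:<region>:<account>:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:<region>:<account>:rep:INSTANCE",
    MigrationType="full-load",
    TableMappings=table_mappings,
)

# Full load plus continuous replication using change data capture on the SQL Server source
dms.create_replication_task(
    ReplicationTaskIdentifier="sqlserver-to-redshift-cdc",
    SourceEndpointArn="arn:aws:dms:<region>:<account>:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:<region>:<account>:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:<region>:<account>:rep:INSTANCE",
    MigrationType="full-load-and-cdc",
    TableMappings=table_mappings,
)
```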
I use several Microsoft Access databases on a regular basis to create reports. To get the source data, I currently have to log in to SAP BW (via SAP NetWeaver), run the source data report, export the results as a .csv file (but actually saving it as a .txt file), and then import that file into Microsoft Access. Is there a way that I can have Access pull the data from SAP BW directly?
Any help is appreciated!
All of the databases used by SAP are industry standard databases and the data is thus going to be stored in a system that supports ODBC.
As far as I know, SAP in general uses Sybase, which is also what SQL Server was originally based on.
So SAP is running on an industry-standard SQL database (Sybase or SQL Server). If it is running on IBM, then the data is in DB2 (often on the AS/400 system).
You thus simply need to contact your IT department and obtain the required ODBC connection strings to the database. You "might" also need to install the latest Sybase drivers if you are not running SAP on SQL Server, but again such information would be available from your SAP support folks.
So you simply set up linked tables in Access to the SAP database, and thus no export, download, or importing of data is required; you will be reporting on live data at all times. The "challenge" is of course to grasp the table structures in SAP - a LARGE challenge, since in most cases the report you have been using for exports is the result of MANY related tables joined together into an "easy" view for exporting. So be prepared for some complex queries to get the data the way you want.
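Before you wire up the linked tables in Access, it can help to sanity-check the connection string IT gives you. A minimal sketch, assuming a placeholder driver, server, database, and credentials:

```python
# Quick sanity check of the ODBC connection string from IT before pointing Access at it.
# Driver name, server, database, and credentials are placeholders for illustration.
import pyodbc

conn_str = (
    "DRIVER={SQL Server Native Client 11.0};"   # or the Sybase/SAP ASE driver your site uses
    "SERVER=sapdbserver;DATABASE=BWP;UID=report_user;PWD=<password>"
)
with pyodbc.connect(conn_str) as conn:
    cur = conn.cursor()
    # List a few tables to confirm the credentials and driver work
    for row in cur.tables(tableType="TABLE"):
        print(row.table_schem, row.table_name)
```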
I'm currently doing business intelligence research on connecting Microsoft SQL Server to a NoSQL database.
My goal is to import data from a NoSQL table into a relational DWH based on SQL Server.
I found the following approaches:
Microsoft Hadoop Connector
Hadoop Cloudera
Building a custom script that generates an XML file and including it via Integration Services (not really satisfying)
Has anybody done something like this before, or does anyone know some kind of "best practices"? It doesn't matter which NoSQL system is used.
NoSQL, by "definition", does not have a standard structure. So, depending on what NoSQL backend you are trying to import from, you will need some custom code to translate that into whatever structured format your data warehouse expects.
Your code does not have to generate XML; it could directly use a database connection (e.g., JDBC, if you are using Java) to make SQL queries to insert the data.
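As a rough sketch of that idea in Python (MongoDB is used here purely as an example NoSQL backend; collection, field, and table names are placeholders, and the target table is assumed to already exist with matching columns):

```python
# Illustration of the custom "translate and insert" code, with MongoDB standing in
# for whatever NoSQL backend you actually have. All names are placeholders.
from pymongo import MongoClient
import pyodbc

mongo = MongoClient("mongodb://nosql-host:27017")
orders = mongo["shop"]["orders"]

sql = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=dwh-server;DATABASE=DWH;Trusted_Connection=yes;"
)
cur = sql.cursor()
cur.fast_executemany = True   # speeds up bulk inserts

# Flatten each document into the columns the warehouse expects
rows = [
    (str(doc["_id"]), doc.get("customer_id"), doc.get("total"), doc.get("created_at"))
    for doc in orders.find({})
]
cur.executemany(
    "INSERT INTO dbo.Orders (source_id, customer_id, total, created_at) VALUES (?, ?, ?, ?)",
    rows,
)
sql.commit()
sql.close()
```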