We have regression tests using Selenium that pick a value from a web page as the actual result. The data in the web UI is sourced from Elasticsearch.
The regression tests that have been written compare this UI value directly with the original data in SQL Server, i.e. before the transfer to Elasticsearch.
My question is:
Should the regression test look for the expected result in SQL Server or Elasticsearch?
If we pull the expected data from SQL Server, then the test also covers the data transfer processing from SQL Server to Elasticsearch.
If we pull the expected data from Elasticsearch, we are only testing the UI and the layers down to Elasticsearch, but not the DB --> Elasticsearch configuration.
I can see the benefits of both methods. Any thoughts?
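To make the two options concrete, here is a minimal sketch (Python with Selenium, pyodbc, and the elasticsearch client; the URL, element ID, table, and index names are made up):

```python
import pyodbc
from elasticsearch import Elasticsearch
from selenium import webdriver
from selenium.webdriver.common.by import By

# Actual result: the value rendered in the UI.
driver = webdriver.Chrome()
driver.get("https://example-app/orders/42")                # placeholder URL
actual = driver.find_element(By.ID, "order-total").text    # placeholder element id

# Option A: expected value from SQL Server (also exercises the SQL -> Elasticsearch transfer).
sql = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=<db>;Trusted_Connection=yes;")
expected_sql = sql.cursor().execute(
    "SELECT Total FROM dbo.Orders WHERE OrderId = ?", 42).fetchone()[0]

# Option B: expected value from Elasticsearch (tests only the UI and the layers down to Elasticsearch).
es = Elasticsearch("http://localhost:9200")
expected_es = es.get(index="orders", id="42")["_source"]["total"]

# Compare against whichever source matches the scope you decide the test should cover.
assert actual == str(expected_sql)
```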
I am working on a project where I need to display SQL Server performance metrics, for example memory consumed/free, free storage space, etc. I researched this, and one thing that came up was DogStatsD.
Datadog provides a library for .NET projects to send custom metrics, but that was not the solution for me because the metrics are displayed on the Datadog website. I have to display all the data received from SQL Server myself (in graphs or whatever is suitable), and there will be multiple servers/instances.
Is there a way to do that, with our web app connecting to multiple databases to receive and display the information?
I cannot use already-available tools for these insights.
You can easily get all the data you need by querying DMVs (dynamic management views) and other resources inside SQL Server. A good start is here.
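As a minimal sketch (Python with pyodbc here; the connection string and the choice of metrics are just examples), something like this pulls memory and database file-size figures straight from the DMVs, and you can repeat it per server/instance:

```python
import pyodbc

# One connection per monitored server/instance; the connection string is a placeholder.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<instance>;DATABASE=master;Trusted_Connection=yes;")
cur = conn.cursor()

# Memory: total vs. available physical memory as SQL Server sees it.
cur.execute("SELECT total_physical_memory_kb, available_physical_memory_kb "
            "FROM sys.dm_os_sys_memory;")
total_kb, available_kb = cur.fetchone()
print("Memory (MB): total", total_kb // 1024, "available", available_kb // 1024)

# Storage: allocated file size per database (size is stored in 8 KB pages).
cur.execute("SELECT DB_NAME(database_id), SUM(size) * 8 / 1024 "
            "FROM sys.master_files GROUP BY database_id;")
for db_name, size_mb in cur.fetchall():
    print(db_name, size_mb, "MB allocated")
```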
My current model looks like this:
Gather disparate data sources and import into SQL Server.
Process and transform data using SSIS packages.
The final step in the SSIS package uploads data to the data warehouse.
BI tools pull data from the data warehouse for end users.
Is this a logical workflow? I was initially going to use Data Factory and the Azure-SSIS integration runtime to process data. However, I didn't understand why those steps were needed, as it would seem simpler in my situation just to build my SSIS packages on-premises and upload the processed data to my data warehouse. What benefits would I gain from using Data Factory and the integration runtime? My main concern is that my current model will make automation difficult, but I'm not entirely sure. Any help is appreciated.
Your possible paths here are SSIS on-premises, SSIS on a VM in the cloud, SSIS in ADF, or natively building the pipelines in ADF.
ADF is an Azure Cloud PaaS managed service for data movement and data integration orchestration. To reach back into on-prem data sources, you need to use an Integration Runtime gateway on the source side. So, if you are looking to move to a Cloud-first architecture or migrating into Azure, ADF is a good solution (use V2).
If you are remaining all on-premises, SSIS on-premises is the best scenario.
If this is hybrid, where you will continue to have some data on-premises and load Azure Data Warehouse in the cloud, then you can still use SSIS on-premises with connectors into ADW as the target. Or, if you have to eliminate the local server concept, you can run that SSIS instance on a VM in Azure.
If you want to eliminate both the datacenter server and the need to patch and maintain the SSIS server, then use SSIS in ADF, which provides SSIS as a service. In that case, you can still move data in a hybrid manner.
It really is going to depend on factors such as: are you more comfortable developing SSIS jobs in Visual Studio, or do you want to build the pipelines in JSON in ADF? Do you have a plan or a need to move to the cloud? Do you want to move to a cloud-managed service (i.e. ADF V2)?
I hope that helps!!
I have an existing SQL Server database that uses Full Text Search and Semantic search for the UI's primary searching capability. The tables used in the search contain around 1 million rows of data.
I'm looking at using Azure Search to replace this; however, my database relies upon the full-text-enabled tables for its core functionality. I'd like to use Azure Search for the "searching" but still have my current table structure in place to be able to edit records and display the detail record when something has been found.
My plan to implement this is to:
Create the Azure Search indexes
Push all of the searchable data from the Full Text enabled table in SQL Server to Azure Search
Have Azure Search return the IDs of documents that match the search criteria
Query the existing database to fetch the rows with those IDs to display on the front end
When some data in the existing database changes, schedule an update in Azure Search to ensure the data stays in sync
Is this a good approach? How do hybrid implementations work where your existing data is in an on-prem database but you want to take advantage of Azure Search?
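To make steps 3 and 4 concrete, here is a rough sketch of what I have in mind (assuming the azure-search-documents Python SDK and pyodbc; the service, index, field, and table names are placeholders):

```python
import pyodbc
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Step 3: ask Azure Search only for the IDs of matching documents.
search = SearchClient("https://<service>.search.windows.net", "products-index",
                      AzureKeyCredential("<query-key>"))
ids = [doc["Id"] for doc in search.search(search_text="user input here", select=["Id"])]

# Step 4: fetch the full rows from the existing SQL Server database for display.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;DATABASE=<db>;Trusted_Connection=yes;")
if ids:
    placeholders = ",".join("?" for _ in ids)
    rows = conn.cursor().execute(
        f"SELECT * FROM dbo.Products WHERE Id IN ({placeholders})", ids).fetchall()
```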
Overall, your approach seems reasonable. A couple of pointers that might be useful:
Azure SQL now has support for Full Text Search, so if moving to Azure SQL is an option for you and you still want to use Azure Search, you can use the Azure SQL indexer. Or you can run SQL Server on IaaS VMs and configure the indexer using the instructions here.
With on-prem SQL Server, you might be able to use the Azure Data Factory sink for Azure Search to sync data.
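As a rough illustration of the indexer route (Python requests against the Azure Search REST API; the data source, index, and indexer names, the api-version, and the 5-minute schedule are assumptions to adjust):

```python
import requests

SERVICE = "https://<your-search-service>.search.windows.net"
HEADERS = {"api-key": "<admin-key>", "Content-Type": "application/json"}
API = "api-version=2020-06-30"   # assumption: any recent GA api-version behaves similarly

# 1) Register the SQL table as a data source.
requests.post(f"{SERVICE}/datasources?{API}", headers=HEADERS, json={
    "name": "products-ds",
    "type": "azuresql",
    "credentials": {"connectionString": "<sql connection string>"},
    "container": {"name": "dbo.Products"},
}).raise_for_status()

# 2) Create an indexer that pulls from the data source into an existing index every 5 minutes.
requests.post(f"{SERVICE}/indexers?{API}", headers=HEADERS, json={
    "name": "products-indexer",
    "dataSourceName": "products-ds",
    "targetIndexName": "products-index",
    "schedule": {"interval": "PT5M"},
}).raise_for_status()
```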
I actually just went through this process, almost exactly. Instead of SQL Server, we are using a different backend data store.
Foremost, we wrote an application to sync all existing data. Pretty simple.
For new documents being added, we made the choice to sync to Azure Search synchronously rather than async. We made this choice because we measured excellent performance when adding to and updating the index. 50-200 ms response time and no failures over hundreds of thousands of records. We couldn't justify the additional cost of building and maintaining workers, durable queues, etc. Caveat: Our web service is located in the same Azure region as the Azure Search instance. If your SQL Server is on-prem, you could experience longer latencies.
We ended up storing about 80% of each record in Azure Search. Obviously, the more you store in Azure Search, the less likely you'll have to perform a worst-case serial "double query."
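For what it's worth, the synchronous path can be as small as this sketch (azure-search-documents Python SDK; the index schema, field names, and the insert_into_primary_store helper are hypothetical):

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient("https://<service>.search.windows.net", "records-index",
                      AzureKeyCredential("<admin-key>"))

def save_record(record):
    row_id = insert_into_primary_store(record)   # hypothetical existing write path
    # Push the searchable subset of the record to Azure Search in the same request.
    results = client.merge_or_upload_documents([{
        "id": str(row_id),
        "title": record["title"],
        "body": record["body"],
    }])
    if not results[0].succeeded:                  # surface indexing failures to the caller
        raise RuntimeError(f"Indexing failed: {results[0].error_message}")
    return row_id
```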
I am new to Data Factory. For a while I have worked with Azure SQL Database. Until now, all data transformation operations (which include data movement, processing, modification of data, fuzzy grouping, and fuzzy lookup) have been performed manually on my system through SSIS. Now we want to automate all the packages and schedule them in Azure. I know that Azure SQL Database has no support for SSIS, and someone suggested Data Factory. Please let me know whether Data Factory can meet all the requirements mentioned above.
Thanks in advance...
Data Factory is not a traditional ETL tool but a tool to orchestrate, schedule, and monitor data pipelines that compose existing storage, movement, and processing services. When you transform data with ADF, the actual transformation is done by another service (a Hive/Pig script running on HDInsight, Azure Batch, U-SQL running on Azure Data Lake Analytics, a SQL Server stored procedure, etc.), and ADF manages and orchestrates the complex scheduling and cloud resources. ADF doesn't have traditional 'out of the box' ETL transforms (like fuzzy lookup). You can write your own scripts or custom .NET code for your business logic, or run stored procedures. You can compose all of this into recurring scheduled data pipelines and monitor everything in one place.
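As a rough sketch of what that looks like, a pipeline that hands the transformation to a stored procedure is authored as JSON along these lines (shown here as a Python dict; all names are placeholders and the exact schema depends on the ADF version you target):

```python
# Hypothetical pipeline definition: ADF only orchestrates; dbo.usp_TransformStaging
# (a placeholder name) does the actual transformation inside the database.
pipeline = {
    "name": "NightlyTransformPipeline",
    "properties": {
        "activities": [
            {
                "name": "RunTransformProc",
                "type": "SqlServerStoredProcedure",
                "linkedServiceName": {
                    "referenceName": "AzureSqlLinkedService",   # placeholder linked service
                    "type": "LinkedServiceReference"
                },
                "typeProperties": {
                    "storedProcedureName": "dbo.usp_TransformStaging"
                }
            }
        ]
    }
}
```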
It seems easy to apply an Azure Search index to a SQL Azure database. I understand that you query the search index using REST APIs and that the index then needs to be maintained/updated. Now, consider a web server running IIS, with an underlying SQL Server database.
What is considered best practice; querying and updating the index from the web server or from SQL Server, e.g. from within a CLR stored procedure? Are there specific design considerations here?
I work on Azure Search team and will try to help.
Querying and updating the index are two different use cases. Presumably, you want to query the index in response to user input in your web app. (It is also possible that you have a SQL stored procedure with some complex logic that needs full text search, but that seems less likely.)
Updating the index can be done in multiple ways. If you can tolerate updating your index at most every 5 minutes, use the Azure Search SQL indexer to automagically update the index for you - see http://azure.microsoft.com/en-us/documentation/articles/search-howto-connecting-azure-sql-database-to-azure-search-using-indexers-2015-02-28/ for details on how to do it. That article describes creating indexers using the REST API, but we now have support for that in the .NET SDK as well.
OTOH, if you need hard real-time updates, you can update the search index at the same time you insert or update the data in your SQL database.
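A minimal sketch of that real-time path, calling the same documents REST endpoint the SDKs wrap (the service, index, key, field names, and api-version are placeholders):

```python
import requests

url = ("https://<service>.search.windows.net/indexes/products-index/docs/index"
       "?api-version=2020-06-30")   # assumption: any recent GA api-version works similarly
payload = {"value": [{
    "@search.action": "mergeOrUpload",   # create or update this document in the index
    "id": "42",
    "title": "Updated title",
}]}
resp = requests.post(url, headers={"api-key": "<admin-key>"}, json=payload)
resp.raise_for_status()
```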
Let me know if you have any follow up questions!
HTH,
Eugene