We have a partner who can provide us with customer data over an SFTP connection (they run an SFTP server on their side that can transfer data packages every 24 hours).
Which AWS services (ideally the best fit) can I use so that our partner is able to send the data on a schedule?
I need some information about what the best practice would be, and perhaps a link to the relevant AWS documentation.
Let's say we have a PostgreSQL and a MongoDB server and we have sharded them.
How does the database know which specific database server to query for a certain record?
Or do we have to implement that logic in the application layer?
Does it differ between SQL and NoSQL databases?
Databases are divided into two parts: server and client.
You can have several servers installed on the same machine, and even a few on another machine, but every time you want to use a client you have to connect to a specific server.
If you connect your app to some DB, your app acts as a client, so you have to connect from your app to one specific server by specifying a network address and port number.
Either your client/app server decides which database server to direct the query to, based on logic embedded in the application, or you have a coordinator node that your client connects to, which then routes the query as appropriate based on the lookup tables it keeps and makes it all transparent to the client.
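As a minimal sketch of the first option (routing logic embedded in the application), assuming a simple hash-based scheme over a hypothetical user id key and purely illustrative host names:

```python
import hashlib

# Hypothetical shard map: logical shard -> connection settings.
# Hosts and ports here are placeholders.
SHARDS = [
    {"host": "db-shard-0.internal", "port": 5432},
    {"host": "db-shard-1.internal", "port": 5432},
    {"host": "db-shard-2.internal", "port": 5432},
]

def shard_for(key: str) -> dict:
    """Pick a shard by hashing the sharding key (e.g. a user id)."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

if __name__ == "__main__":
    shard = shard_for("user-42")
    # The app would now open a client connection to this specific server, e.g.
    # psycopg2.connect(host=shard["host"], port=shard["port"], dbname="app")
    print(f"Query user-42 on {shard['host']}:{shard['port']}")
```

For the second option, MongoDB sharded clusters already ship a coordinator of this kind (the mongos query router), whereas plain PostgreSQL needs either application-side routing like the above or an add-on such as Citus to handle it for you.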
It is your hypothetical setup; we can't know what you did, so you have to tell us. Did you use a commercial or open-source add-on to implement this?
I need to connect Angular with Redshift for historical reporting. Can this be achieved, and what are the prerequisites?
This is possible in theory using the Redshift Data API, but you should consider whether you truly want your client machine to write and execute SQL commands directly against Redshift.
To allow this, the following would be true:
The client machine sends the SQL to be executed; a malicious actor could modify it, so permissions would be important.
You would need to generate IAM credentials via a service like Cognito to directly interact with the API.
It would be more appropriate to create an API that communicates with Redshift directly, offering protection over which SQL can be executed.
This could use API Gateway and Lambda to keep it simple, with your frontend calling this instead of directly writing the SQL.
More information is available in the Announcing Data API for Amazon Redshift post.
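As a rough sketch of the Lambda behind such an API Gateway endpoint, using the Redshift Data API through boto3 (the cluster identifier, database, user, and table names below are placeholders, and the SQL is fixed server-side so the frontend never supplies raw SQL):

```python
import json

import boto3

# boto3 client for the Redshift Data API.
redshift_data = boto3.client("redshift-data")

# Placeholder identifiers - replace with your own cluster/database/user.
CLUSTER_ID = "my-redshift-cluster"
DATABASE = "reports"
DB_USER = "readonly_reporting"

def lambda_handler(event, context):
    """Run a fixed reporting query so the Angular frontend never sends raw SQL."""
    response = redshift_data.execute_statement(
        ClusterIdentifier=CLUSTER_ID,
        Database=DATABASE,
        DbUser=DB_USER,
        Sql="SELECT order_date, total FROM sales_history LIMIT 100",
    )
    # execute_statement is asynchronous; it returns a statement id that can be
    # polled later with describe_statement / get_statement_result.
    return {
        "statusCode": 202,
        "body": json.dumps({"statementId": response["Id"]}),
    }
```

Because execute_statement is asynchronous, the Lambda (or a second endpoint) would poll describe_statement and fetch rows with get_statement_result once the query has finished.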
I have a project where there will be a master source database, a client Windows service, a client application, and a client database.
We need to have a client database because there are times when the client won't be able to reach the master source database due to connectivity issues.
I was hoping to get some of your expertise on what would be the best/most efficient way to sync certain tables from the client database to the master database, and other tables from the master database to the client database, in close to real time (within minutes). I would also need to keep track of what was synced in the master database so that I can use it in a dashboard.
There could be up to 10,000+ clients trying to pull this information all at once.
Any suggestions would be helpful.
You may be interested in a scheme based on Apache Kafka technologies.
Apache Kafka supports connectors (through Kafka Connect).
The architecture would look like this:
Local DB - Connector - Apache Kafka - Connector - Server DB.
Connectors support different databases.
You can use connectors to connect to a whole database or to individual tables.
Your setup is similar to an ETL pipeline built on Apache Kafka.
You can also develop an application with Apache Kafka Streams that tracks what was synced into the main database.
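Purely as an illustration, a source connector could be registered against a Kafka Connect worker's REST API roughly like this (the worker URL, connector class, connection string, and table names are placeholders; use whichever connector actually matches your databases, e.g. a JDBC or Debezium source):

```python
import json
import urllib.request

# Placeholder Kafka Connect worker URL and connector settings.
CONNECT_URL = "http://kafka-connect.internal:8083/connectors"

connector_config = {
    "name": "client-db-source",
    "config": {
        # Illustrative connector class and properties; adjust for your DB.
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:sqlserver://client-db.internal:1433;databaseName=app",
        "table.whitelist": "orders,inventory",
        "mode": "timestamp",
        "timestamp.column.name": "updated_at",
        "topic.prefix": "client-",
    },
}

request = urllib.request.Request(
    CONNECT_URL,
    data=json.dumps(connector_config).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(request) as response:
    print(response.status, response.read().decode())
```

A matching sink connector on the other side would then write the topics into the server DB, completing the Local DB - Connector - Apache Kafka - Connector - Server DB chain.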
The other way, if you don't use Apache Kafka, is master-master replication.
Our client doesn't want to let us make any calls into their SQL database (not even create a replica, etc.). The best solution we have thought of so far is to instantiate a Google Cloud SQL server, ask the customer to push their data once a day/week (using the server's public IP), and then consume the data by pushing it into Google BigQuery.
I have been reading many topics on the web, and my possible solution is asking the user to do a weekly ETL -> Cloud SQL -> BigQuery. Is that a good approach?
To sum up, I am looking for recommendations on best/cheap practices and possible ways to let the user insert data into GCP without exposing their data or my infrastructure.
My cloud provider is Google Cloud and my client uses SQL Server.
We are open to new or similar options (even other providers like Amazon or Azure).
Constraints:
The client will send data periodically (once-a-day or once-a-week ingestion).
The data should ultimately be sent to and stored in BigQuery.
The cost of having Cloud SQL in Google is high, given that we don't need the allocated CPU/memory and public IP available 24/7 (only a few times a month, e.g. 4 times a month).
The question is missing many details, but how about:
Have the customer create a weekly .csv.
Send the .csv with the new data to GCS.
Load it into BigQuery (a sketch of this step follows below).
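A minimal sketch of that load step with the google-cloud-bigquery Python client, assuming the customer's weekly file has already landed in a GCS bucket (the bucket, object, and table names are placeholders):

```python
from google.cloud import bigquery

# Placeholder GCS object and destination table.
SOURCE_URI = "gs://partner-uploads/weekly/customers.csv"
TABLE_ID = "my-project.partner_data.customers"

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # header row
    autodetect=True,       # infer the schema from the file
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Kick off the load job and wait for it to finish.
load_job = client.load_table_from_uri(SOURCE_URI, TABLE_ID, job_config=job_config)
load_job.result()

table = client.get_table(TABLE_ID)
print(f"Loaded {table.num_rows} rows into {TABLE_ID}")
```

This sidesteps Cloud SQL entirely, so nothing has to run 24/7; the customer only needs permission to write to the bucket (for example via a dedicated service account or signed URLs).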
We have an internal SQL Server 2008 R2 database that we'd like to expose (partially, only some tables) to our clients via the Internet, so they can feed their Excel reports. What are our best options? How should we provide security (i.e. should we create another, staging DB server in a DMZ for this)? As for the quantity of data to transfer, it's very small (< 100 records).
Here is one simple way to start if they need live, real-time access:
Create a custom SQL user account for web access, locked down with read-only access to the relevant tables or stored procedures.
Create a REST web service that connects to the database using the SQL account above. Expose methods for each set of data that can be retrieved (a sketch of such a service follows after these steps).
Make sure the web service runs over SSL (HTTPS) and requires username/password authentication, for example via BASIC auth with a custom hard-coded account per client.
Then when the clients need to retrieve data, they can access a specific URL and receive data in CSV format or whatever is convenient for their reports. Also, REST web services are easily accessed via an XMLHTTP object if you have clients that are technically savvy and can write VBA macros.
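The question doesn't say what stack the web service would use, so purely as an illustration, here is a minimal read-only endpoint sketched in Python with Flask and pyodbc (the connection string, credentials, and table name are placeholders, and the BASIC-auth check is left as a stub):

```python
import csv
import io

import pyodbc
from flask import Flask, Response

app = Flask(__name__)

# Placeholder connection string using the locked-down read-only SQL account.
CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=internal-sql.example.local;DATABASE=Reports;"
    "UID=web_readonly;PWD=example-password"
)

@app.route("/reports/orders")
def orders_report():
    # A per-client BASIC auth check would normally go here.
    conn = pyodbc.connect(CONN_STR)
    try:
        cursor = conn.cursor()
        # Fixed, read-only query - clients never supply their own SQL.
        cursor.execute("SELECT OrderId, OrderDate, Total FROM dbo.OrdersSummary")
        buffer = io.StringIO()
        writer = csv.writer(buffer)
        writer.writerow([column[0] for column in cursor.description])
        writer.writerows(cursor.fetchall())
    finally:
        conn.close()
    return Response(buffer.getvalue(), mimetype="text/csv")

if __name__ == "__main__":
    # In production this would sit behind HTTPS (e.g. IIS or a reverse proxy).
    app.run()
```

The key point is that the SQL lives server-side and runs under the locked-down read-only account, so clients can only fetch the data you choose to expose.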
If the data is not needed in real time, for instance if once a day is often enough, you could probably just generate .csv output files and host them somewhere the clients can download them manually through their web browser, for example on an FTP site or a simple IIS website with BASIC authentication.
If the data is not needed in real time, the other alternative is to use SSIS or SSRS to export an Excel file and email it to your clients.