I have source tables in Snowflake and Destination tables in Snowflake.
I need to load data from source to destination using ADF.
Requirement: I need to load data using single pipeline for all the tables.
E.g., suppose I have 40 tables in the source and need to load the data from all 40 into the destination tables. I need to create a single pipeline that loads all of the tables at once.
Can anyone help me in achieving this?
Thanks,
P.
This is a fairly broad question. So take this all as general thoughts, more than specific advice.
Feel free to ask more specific questions, and I'll try to update/expand on this.
ADF is useful as an orchestration/monitoring process, but it can be tricky to use for the actual copying and maneuvering of data in Snowflake. My high-level recommendation is to write your logic and loading code in Snowflake stored procedures;
then you can use ADF to orchestrate by simply calling those stored procedures. You get the benefits of using ADF for what it is good at, and allow Snowflake to do the heavy lifting, which is what it is good at.
Hopefully you'd be able to parameterize the procedures so that you can have one procedure (or a few) that takes a table name and dynamically figures out column names and the like to run your loading process.
Assorted Notes on implementation.
ADF does have a native Snowflake connector. It is fairly new, so a lot of online posts will tell you how to set up a custom ODBC connector. You don't need to do this. Use the native connector and auto resolve integration and it should work for you.
You can write a query in an ADF lookup activity to output your list of tables, along with any needed parameters (like primary key, order by column, procedure name to call, etc.), then feed that list into an ADF foreach loop.
foreach loops are a little limited in that there are some things that you can't nest inside of a loop (like another foreach loop). If you need extra functionality, you can have the foreach loop call a child ADF pipeline (passing in those parameters) and have the child pipeline manage your table processing logic.
Snowflake has pretty good options for querying metadata based on a table name. See INFORMATION_SCHEMA. Between that and just a tiny bit of JavaScript logic, it's not too bad to generate dynamic queries (e.g. with column names specific to a provided table name).
If you do want to use ADF's Copy Activities, I think you'll need to set up an intermediary Azure Storage Account connection. I believe this is because it uses COPY INTO under the hood, which requires using external storage.
ADF doesn't have many good options for avoiding running one pipeline multiple times at once. Either be careful about making sure that your code can handle edge cases like this, or that your scheduling/timeouts won't allow for that scenario with a pipeline running too long.
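As a rough sketch of the metadata-driven idea in the notes above: a list of tables (as a Lookup activity might return it) drives one generic load statement per table. It is written in Python only for illustration; in practice this logic would more likely live in a Snowflake stored procedure. The database and table names and the `build_load_sql` helper are made up, and the column lists are assumed to have been read from INFORMATION_SCHEMA.COLUMNS.

```python
from typing import Dict, List


def quote_ident(name: str) -> str:
    """Double-quote a column name, escaping any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_load_sql(source: str, dest: str, columns: List[str]) -> str:
    """Build an INSERT ... SELECT that copies the named columns.

    `source` and `dest` are assumed to be trusted, fully qualified table
    names taken from your own control list, so they are used as-is.
    """
    if not columns:
        raise ValueError("at least one column is required")
    col_list = ", ".join(quote_ident(c) for c in columns)
    return f"INSERT INTO {dest} ({col_list}) SELECT {col_list} FROM {source}"


# Hypothetical control list: one entry per table to load.
tables: List[Dict] = [
    {"source": "SRC_DB.PUBLIC.CUSTOMERS",
     "dest": "DST_DB.PUBLIC.CUSTOMERS",
     "columns": ["ID", "NAME"]},
    {"source": "SRC_DB.PUBLIC.ORDERS",
     "dest": "DST_DB.PUBLIC.ORDERS",
     "columns": ["ID", "CUSTOMER_ID", "TOTAL"]},
]

# One statement per table; the foreach loop would run these one by one.
statements = [build_load_sql(t["source"], t["dest"], t["columns"]) for t in tables]
for statement in statements:
    print(statement)
```

The point is only that a single parameterized routine can cover all 40 tables, with the per-table differences kept in the control list.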
Extra note:
I don't know how tied you are to ADF, but without more context, I might suggest a quick look into DBT for this use case. It's a great tool for this specific scenario of Snowflake to Snowflake processing/transforming. My team's been much happier since moving some of our projects from ADF to DBT. (not sponsored :P )
For my new project I'm planning to use JSON data stored in a text file rather than fetching data from a database. My concept is to save a JSON file on the server whenever an admin creates a new entry in the database.
As security is not an issue, will this approach make data access faster for users, or should I go with the usual database queries?
JSON is typically used as a way to format the data for the purpose of transporting it somewhere. Databases are typically used for storing data.
What you've described may be perfectly sensible, but you really need to say a little bit more about your project before the community can comment on your approach.
What's the pattern of access? Is it always read-only for the user, editable only by site administrator for example?
You shouldn't worry about performance early on. Worry more about ease of development, maintenance and reliability; you can always optimise afterwards.
You may want to look at http://www.mongodb.org/. MongoDB is a document-centric store that keeps its data as JSON-like documents (stored internally as BSON).
JSON in combination with jQuery is a great, fast option for smoothly updating web pages, but ultimately it will still come down to the same database query.
Just make sure your query is efficient. Use a stored proc.
JSON is just the way the data is sent from the server (a Web controller in MVC or code-behind in standard C#) to the client (jQuery or JavaScript).
Ultimately the database will be queried the same way.
You should stick with the classic method (database), because you'll face many problems with concurrency and with having too many files to handle.
I think you should go with the usual database query.
If you use a JSON file you'll have to sync the JSON files with the DB (which means extra work) and you'll face I/O problems (if your site is super busy).
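If you do try the JSON-file route anyway, the sync step mentioned above could look roughly like this Python sketch. It rewrites the file each time an admin saves an entry, using a temporary file plus a rename so that readers never see a half-written file. The function names and the file name are made up for illustration, and the rows are assumed to have already been fetched from the database.

```python
import json
import os
import tempfile
from typing import Any, List


def write_snapshot(rows: List[Any], path: str) -> None:
    """Write rows as JSON to path, replacing any previous file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename it over the
    # target, so a concurrent reader gets either the old or the new file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(rows, handle)
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_snapshot(path: str) -> List[Any]:
    """Read the JSON snapshot back; this is what a page request would do."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Hypothetical usage: call write_snapshot right after the admin's insert.
    target = os.path.join(tempfile.mkdtemp(), "entries.json")
    write_snapshot([{"id": 1, "title": "first entry"}], target)
    print(read_snapshot(target))
```

This only covers a single file with a single writer; with many files or several admins saving at once, the concurrency concerns raised above still apply.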
I wasn't sure whether to ask this here or on SuperUser, so I apologize if it doesn't belong here.
I created a small PHP/MySQL database app to manage the customer loyalty data for my mom's shop, intending to set it up locally on her cash register computer with XAMPP. However, I've been asked to reimplement the system in a GUI relational database such as MS Access or OpenOffice Base, primarily so that she can do things like mail merge and graphical reports with a GUI (that I don't have to write).
I can easily replicate my MySQL table structure and relationships, and create a few of the more basic forms and reports, but I've never done any scripting, macros etc in Access or Base. My PHP handled a lot more than just form input, there was some scripting involved that I don't know how to implement in Access / Base. Worth noting: if I end up using Access, it'll be Access 2007.
Here's a quick overview of what I'm trying to make, in case it helps. Sorry for the length.
The business is a take & bake food market, and the database is replacing a physical stamp-card loyalty system. Each customer gets a stamp on their card for every $25 they spend. They earn free meals as follows:
- On the 8th stamp, they earn a free side dish.
- On the 16th stamp, they earn a free regular size meal.
- On the 24th stamp, they earn a free family size meal, and their card resets to zero stamps.
The date of each stamp must be recorded (otherwise I'd just increment one field instead of having a stamps table).
I have 3 tables: customers, stamps, and freebies. customers has a 1-to-many relationship with both stamps and freebies.
customers is a simple contact list.
columns: ID, firstname, lastname, email, phone
stamps keeps records of each stamp earned.
columns: ID, customerID, date, index (1-24; the Nth stamp on that customer's card)
freebies keeps records of each free meal they have earned.
columns: ID, customerID, date, size, is_redeemed
Here's the magic from my PHP that I don't know how to implement in Access/Base:
When a user selects a customer and clicks an "add a stamp" button:
stamps is queried to grab the index from the last stamp for that customer => local variable N
if N == 24, set N = 0. Increment N by 1.
a record is inserted to stamps with the current date, customer id and an index of N
if N == 8, 16 or 24 a record is inserted into freebies with the appropriate size and an alert appears to notify the user that the customer earned some free shit.
Some kind of "view customer" page (form? report?) that shows all the stamps and freebies they've earned, with "redeem" buttons next to the freebies that have not been redeemed.
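For reference, the "add a stamp" steps above boil down to the following logic, sketched here in Python rather than PHP or VBA. The names `next_stamp` and `FREEBIES` are made up for illustration, and the actual inserts into stamps and freebies are left out.

```python
from typing import Optional, Tuple

CARD_SIZE = 24
# Stamp index -> freebie size earned on that stamp, per the rules above.
FREEBIES = {8: "side dish", 16: "regular", 24: "family"}


def next_stamp(last_index: Optional[int]) -> Tuple[int, Optional[str]]:
    """Return (index of the new stamp, freebie size earned or None).

    `last_index` is the index of the customer's most recent stamp,
    or None if the customer has no stamps yet.
    """
    n = last_index or 0      # no stamps yet counts as 0
    if n == CARD_SIZE:       # card is full: start again from zero
        n = 0
    n += 1
    return n, FREEBIES.get(n)


if __name__ == "__main__":
    for last in (None, 7, 15, 23, 24):
        print(last, next_stamp(last))
```

Whatever tool ends up hosting it, the button handler only has to compute these two values and then insert one row into stamps and, when a freebie is earned, one row into freebies.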
In general I need to make it fairly idiot-proof and "big-button" -- automation wherever possible -- cashiers at the shop should be able to use it with no prior knowledge of databases.
Is this practical in a program like Access or Base, or should I just convince her to use my PHP version? If I need to write code, what language(s) do I need to teach myself? Should I be structuring my data differently? I'm not sure where to start here.
Really I think this would be a piece of cake. It's true like Tony said that you can continue to use the same tables/backend and that's probably the route I'd recommend. You'll need to install MySQL's ODBC drivers on any machine that will be linking to the MySQL database. After that create a DSN and then access the tables through that from within Access. You may want to add code later to relink the tables every time the software loads using DSN-less tables. This way the database can run on a machine that doesn't have a DSN configured. I do recommend that you go with either MySQL or SQL Server Express as opposed to an MS Access backend but I'm not going to take the time to elaborate on why.
I think you can actually get much more functionality from a traditional Windows Desktop Application (built in MS Access or VB.Net) than you could with PHP. And it's my own opinion that you'll be able to do it with less code and less time invested. I mentioned VB.Net but I'd probably recommend MS Access over VB.Net for databases although either one will do the job.
As Tony already mentioned, Access uses VBA language. It takes a little while to really pick it up unless you already have some experience with other programming languages that use the Basic syntax. I've found that moving from VBA/ASP to PHP/Javascript has been slow going though not necessarily so difficult. PHP uses the C style code with curly braces and VBA does not.
Coming from PHP, here's some things that may be new to you:
Stronger Variable Typing - In Access you can actually declare your variables with a specified data type such as String, Date, Integer, Long, Single, Double, etc. I recommend using this as much as possible. There are very few times when you will need to use the more general types such as Object or Variant. Variables declared with a specified data type will throw an error if you attempt to put the wrong data type into them. This helps you write better code, in my opinion.
Option Explicit - Option Explicit is a declaration you can put at the top of each code module to enforce that you have to declare a variable with a Dim statement before using it. I highly recommend that you do this. It will save you a lot of time troubleshooting problems.
Set MyVariable = Nothing - Cleaning up object variables after using them is one of the best practices of using MS Access. You'll use this to clean up DAO Recordset variables, ADO Connection variables, ADO Recordset variables, form variables, etc. Any variable that you declare as an object (or some specific type of object) should get cleaned up by setting it to Nothing when you no longer need to use the variable.
No Includes - There is no such thing as an Include statement in MS Access. You can import code modules from other Access databases. You can call functions contained in a DLL. But there is no include in Access like there is in PHP.
DoCmd - You'll have to use MS Access's DoCmd object to open forms and reports and perform other common tasks. Just a warning: it's frequently irrational. Long-time Access users don't think much of it but I've found these commands to have little cohesion or consistency. Let me give you an example. If you want to close a form you use this code: DoCmd.Close acForm, "frmSomeFormName" but if you want to open a form you use this code: DoCmd.OpenForm "frmName". In this example, why does opening a form get its own OpenForm function while closing a form simply uses Close followed by a constant that tells Access you want to close a form? I have no answer. DoCmd is full of this type of inconsistency. Blueclaw does a pretty good job of listing the most common DoCmd's although I don't think the examples there are exactly stellar.
References - You shouldn't need to use references very frequently. You will have to use them to enable things like DAO and ADO (see further down) or Microsoft Scripting Runtime (often used for accessing, reading, writing, etc. to files and folders). It's basically something you do once and then you forget about it.
ActiveX Controls - Probably better to try to build your project without using these. They require the same control to be installed on each computer that will run your software. I don't know much about it but I understand there are some compatibility issues that can come up if you use ActiveX controls in your project.
DAO - Data Access Objects - DAO is Access's original, native set of objects used to interface to your data container. Although it is primarily used to access data held in an Access database backend/container, it also can be used for some tasks when you are using ODBC linked tables. DAO is very helpful when you need to loop through recordsets to make changes in bulk. You can even use it to loop through form controls. One place I use this is to reorder line numbers in invoice details after a line gets deleted. Another typical use is to use it in "utility" functions where you need to change something in a given field or fields that can't be done with an update query.
CurrentDb.Execute("Update or Delete query here...") The Execute method of the CurrentDb object is, in my understanding, an implicit DAO call. It allows you to run Update or Delete queries on local and linked tables from VBA code. You can also achieve this using DoCmd.RunSQL but CurrentDb.Execute is the preferred method because it gives you improved error messages if something fails if you append ", dbFailOnError" as a second argument.
ADO - ActiveX Data Objects - I recommended not using ActiveX controls but this is one ActiveX technology you might need. To my knowledge, ADO is the only thing you can use to run stored procedures from Access. ADO is similar to DAO and was supposed to replace DAO although it didn't really. I tend to use both of them in my applications. It takes a while to figure out which one will do the job for you or which one will do it better. In general, I stick with DAO for everything except for running stored procedures or connecting to outside data sources (i.e. not using linked tables). DAO and ADO are both part of MDAC (Microsoft Data Access Components) which gets installed with MS Access.
File System Object - This object, mentioned above, is often used to access files and folders. You'll find you may have to use it to copy files, create text files, read text files, write to text files, etc. It's a part of Microsoft Scripting Runtime which is part of Windows Script Host (exists on all Windows computers although it can become "broken"). Access does give you some ways of accessing files and folders using VBA's built-in functions/methods such as Dir() but these functions don't cover all the bases.
SQL - Structured Query Language - You're probably familiar with SQL already but you'll need to get used to Access's "superset" of the SQL language. It's not drastically different but Access does allow you to use Access functions (e.g. Len, Left, Right) or your own custom functions. Your own functions just need to exist in a code module and be declared as public. An example of your own function would be Repeat (doesn't exist in MS Access, exists in MySQL) which is sometimes used to create indentation based on Count(*) in tables with child parent relationships. I'm giving that as an example although it's unlikely you'll need to use such a function unless you are going to be using the Nested Set Model to hold hierarchical categories.
Variables Cannot be in Literal Strings - This is a massive difference between Access and PHP. PHP lets you write: "SELECT * FROM tag WHERE tagtext = '$mytag'" In MS Access you'd have to write it like this: "SELECT * FROM tag WHERE tagtext = '" & strMyTag & "'" (You may not ever need to worry about this unless you are formatting a query in VBA to retrieve a DAO or ADO recordset. What I've just pointed out doesn't generally affect your form's or report's recordsource or saved queries because you generally don't use variables in those.)
Query - Not difficult to figure out but in Access a Query is basically a MySQL view. I actually don't save queries very often. I generally use them only to derive my SQL "code" and then I take that SQL and paste it into my form as the Recordsource instead of binding a form to a saved query. It doesn't matter which way you want to do it. There are pros and cons either way you choose to do this. As a side note, don't be afraid to create views in MySQL and link to them in Access. When you link to them Access sees them as tables. Whether or not it is updateable/writeable will depend on the construction of the view. Certain types of queries/views (such as unions) are read-only.
As a final note, I recommend MS Access over OpenOffice.org Base. I tried out Base a couple years ago and I found it to lack a lot of features. However, I was already experienced in MS Access so I'm not sure that I gave OpenOffice Base a fair trial. What I found missing was events. I'm accustomed to being able to fine-tune my forms in MS Access to give users a very responsive UI with lots of feedback and I couldn't figure out how to do this in Base. Maybe things have changed since I last tried it, I don't know. Here's an article comparing Base to MS Access.
Other SO Access gurus, feel free to point out any errors in my answer. I still consider myself a rookie in programming.
I can't speak for Base. However Access can link to the MySQL database directly so you don't have to redo the data. As far as creating the bits and pieces of code in Access that would be quite easy. Access, Word and Excel, use VBA which is identical, except for Access, Word or Excel object model specific stuff, to Visual Basic 6.0. Indeed a minor obscure bug when using the VBA editor is also present in the VB6 editor.
I will also add that one of my Access databases had 160 tables, 1200 queries, 350 forms, 450 reports and 70K lines of code. So your app is quite small by comparison.
On the freebies table I would change the is_redeemed field to a date_redeemed. I definitely agree with recording each stamp and freebie earned as separate records in tables. This way it's real easy to show the customer a history rather than just stating you've only got x stamps.
Also consider a bar code reader and issuing the users bar-coded plastic wallet cards. This will greatly reduce the time required by the clerk to look up their records. Indeed, consider using loyalty cards common in your area that they might already have, such as a Safeway or AirMiles card. I'd put that number in a separate table though, just in case they lose the first card they were given, or so they can track multiple cards. A family might want to accumulate points onto one account.
Thanks for the lengthy posting. This enables us to give you some suggestions on different facets you might not have thought of in the first place.
My suggestion: don't do it. Run a mysql server on the PC in question, have your PHP app as the front end for the cashiers, and then if you want MS Access's reports feature, just have Access connect to the mysql database with ODBC.
The best implementation is quite frequently the one you already have.
I am looking for a way to import a datatable from Access into an Excel variable and then run queries through this variable to speed up the process. I am trying to migrate from C# .NET where I read a data table from an access database into memory and then used LINQ to query this dataset. It is MUCH faster than how I have it currently coded in VBA where I must make lots of calls to the actual database, which is slow. I have seen the QueryTable mentioned, but it appears that this requires pasting the data into the excel sheet. I would like to keep everything in memory and minimize the interaction between the Excel Sheet and the VBA code as much as possible.
I wish we didn't need to use Excel+VBA to do this, but we're kind of stuck with that for now. Thanks for the help!
I don't know of anything like LINQ for VBA.
If you keep the ADO Connection object in scope by making it Public, you can Execute Commands against it. It's not as fast as LINQ, but it's definitely faster than creating and destroying Connection objects for every call.
If the tables aren't too huge, I tend to read the tables into custom classes in VBA with the appropriate Parent/Child relationships set up. The very obvious downside to this is that you can't use SQL to get a recordset of data from your classes. I have to use a lot of looping when I need more than one specific record. And that means if you have 1m records, it would be quicker to call the database.
If you're interested in the last one, you can read some of the stuff I've written on it here
http://www.dailydoseofexcel.com/archives/2008/12/07/vba-framework/
http://www.dailydoseofexcel.com/archives/2008/11/15/creating-classes-from-access-tables/
http://www.dailydoseofexcel.com/archives/2007/12/28/terminating-dependent-classes/ (read Rob Bruce's comment)
I would just read it into an ADO recordset, then get the data I need from the recordset as I need it. Of course this will depend on the size of the table you want to read.
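The read-once-then-look-up idea from the answers above can be sketched as follows. This is Python rather than VBA, and an in-memory SQLite database stands in for the Access file; the table and column names are made up for illustration. In VBA the same shape could be built with a Scripting.Dictionary or with custom classes.

```python
import sqlite3
from collections import defaultdict

# In-memory SQLite stands in for the Access database (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (id INTEGER PRIMARY KEY, order_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO order_lines VALUES (1, 1, 10.0), (2, 1, 5.5), (3, 2, 7.0);
""")

# One query per table, then no further database calls.
orders = {
    order_id: {"customer": customer}
    for order_id, customer in conn.execute("SELECT id, customer FROM orders")
}
lines_by_order = defaultdict(list)  # parent id -> list of child amounts
for order_id, amount in conn.execute("SELECT order_id, amount FROM order_lines"):
    lines_by_order[order_id].append(amount)
conn.close()


def order_total(order_id: int) -> float:
    """Sum an order's lines from memory; unknown orders total 0."""
    return sum(lines_by_order.get(order_id, []))


print(orders[1]["customer"], order_total(1))
```

Lookups by key are cheap this way, but anything that would have been a WHERE clause turns into a loop over the in-memory data, which is the trade-off described above.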
I have a large dataset, say 1,000,000,000 rows, that lives on a server. I need a user to be able to consume (i.e. "run queries upon") that data seamlessly, over the web, from within Access and/or Excel. Additionally, I need to filter the data on the server-side according to the user connected to it.
My current approach is to create a webservice that looks like an ODBC data source and connect to it from Excel.
Questions:
Is this the best way?
If so, what's the best way to create a custom ODBC data source?
I really don't think this is the best way. I don't know your scenario, but I would prefer another approach.
There is a discussion about that: Creating a custom ODBC driver
One of the suggestions was using BI approach.