I'm a self-taught Excel VBA and SQL user. I'm testing out some simple queries before I add complexity. I must be missing something blindingly obvious here...
I am using an ADO connection to run a SQL SELECT statement on a table in the active workbook (ThisWorkbook). The Excel Table is named "tbl_QDB" and is on worksheet "MyQDB". The table starts at cell A1, so there are no blank or populated cells above the Table HeaderRowRange.
I have set up an ADO connection to ThisWorkbook and it is working fine. Here's the code:
Sub ConnectionOpen2()
    '### UNDER DEVELOPMENT
    'cn2, rec2 and datasource are assumed to be module-level variables (late-bound ADO)
    Dim sconnect As String
    Const adUseClient = 3
    Const adUseServer = 2
    Const adLockOptimistic = 3
    Const adOpenKeyset = 1
    Const adOpenDynamic = 2
    Const adOpenStatic = 3 'late binding means the ADO constants must be defined here
    'used to connect to this workbook for SQL runs
    On Error GoTo err_OpenConnection2
    Set cn2 = CreateObject("ADODB.Connection")
    Set rec2 = CreateObject("ADODB.Recordset")
    rec2.CursorLocation = adUseClient
    rec2.CursorType = adOpenStatic
    rec2.LockType = adLockOptimistic
    datasource = ThisWorkbook.FullName
    sconnect = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
               "Data Source=" & datasource & ";" & _
               "Extended Properties=""Excel 12.0;HDR=YES;ReadOnly=False;Imex=0"";"
    cn2.Open sconnect
    'etc, etc...
End Sub
I can run this simplest of basic SELECT queries:
SQLSTR = "SELECT * FROM [MYQDB$]"
rec2.Open SQLSTR, cn2
This works and produces 10 records, i.e. rec2.RecordCount = 10.
However, if I try this, it errors:
SQLSTR="SELECT QID_1 FROM [MYQDB$]"
QID_1 is a valid field in the table on worksheet "MyQDB".
It doesn't change the error if I enclose QID_1 in (), [] or backticks.
I can even replace the field name with a made-up field, e.g. DonaldDuck, and I get the same error.
Why would the SELECT statement work if I use "*" but not if I use any of the field names in the table? This seems so basic that I feel I must have missed a simple but key point.
I really will appreciate if someone can point out the mistake!
The SQL should work - if the field exists. Execute the Select * and dump the field list:
For i = 0 To rec2.Fields.Count - 1
    Debug.Print rec2.Fields(i).Name
Next i
Thank you all for your comments.
That suggestion, @FunThomas, was an eye-opener! The results were F1, F2, F3, etc., so the field names (or column names if you prefer) were not being recognised.
This would explain why, after days of trying to join this table with another in a closed, external workbook, it was not working. SQL error messages can be quite obtuse and were not saying it didn't recognise the field name.
I have now fixed that issue. Here's what I can tell / warn others:
I started this table with rows above the header. In 2 of those cells above I recorded the last connection time and status to another workbook table. I realised before that these extra rows, with data populated in ANY cell above the headers, were causing problems with SQL. Despite having my data in an Excel Table, the SQL "engine" for Excel looks at the sheet, i.e. [MYQDB$], where the data is stored (although I am aware that you can specify a sheet and range, but cannot use the actual table name as the range).
It is OK to have blank rows above the table HeaderRowRange. So, I deleted the cells containing the data above the table HeaderRowRange. Instead, I placed a Text Box and used a formula to look at another sheet, where the last connection time and status were now stored, to supply the text for the text box.
I can now see that even this Text Box, which occupies no cell, causes a problem for Excel SQL.
Before posting my question here, I made a copy of the workbook and removed the text box and the rows above the table HeaderRowRange. I still got errors, and I still got F1, F2, F3, etc. as field names (per @FunThomas's suggestion).
Only after deleting these rows and the text box and then resizing the table (to the same range as before, in fact) did the Excel SQL recognise the proper field names. I was then even able (just for curiosity) to insert a blank row above the table HeaderRowRange, and the SQL still worked.
It seems to me that Excel retained the old table definition in memory, and only by removing all data above the table HeaderRowRange and then resizing the table did it refresh that. Perhaps I should be less lazy in future and specify the sheet name and range (the table's address) in the SQL: maybe that would ignore data in cells above the HeaderRowRange?
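For what it's worth, the ACE provider does accept a sheet-scoped range directly in the FROM clause; a minimal sketch, assuming the table occupies A1:D11 on MyQDB (the actual range may differ):
SQLSTR = "SELECT QID_1 FROM [MyQDB$A1:D11]" 'assumed range; with HDR=YES the first row of the range is read as the header row
rec2.Open SQLSTR, cn2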
@PanagiotisKanavos: I was originally trying to compare two tables (actual Excel Tables, not just ranges, hence they have field names), one in ThisWorkbook and another in a closed Excel workbook. SQL is the best way to do this. Having failed to get a left join to work between these tables (and this question might now have revealed why that wouldn't work!), I decided to bring the data from the external workbook into ThisWorkbook and compare there. Then I was going to find the differences, store them in a recordset (hence SQL) and then INSERT INTO the external workbook.
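For the external comparison, the closed workbook can also be referenced inline in the FROM clause; a rough sketch, assuming a hypothetical file path, sheet name and second field (untested against the real data):
'Hypothetical path/sheet/field names for illustration only.
Dim sql As String
sql = "SELECT t1.QID_1, t2.SomeField " & _
      "FROM [MyQDB$] AS t1 " & _
      "LEFT JOIN [Excel 12.0;HDR=YES;Database=C:\Data\External.xlsx].[Sheet1$] AS t2 " & _
      "ON t1.QID_1 = t2.QID_1"
rec2.Open sql, cn2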
Thanks for your help guys!
I have this code that lists all indexes, but I only need those where Indexed: "Yes, duplicates OK" is set. Is there any way to do that instead of manually looking through all the indexes? I need to migrate the data to SQL Server, but with this I only get empty tables in SQL Server.
Const adSchemaIndexes As Long = 12
Dim cn As Object ' ADODB.Connection
Dim rs As Object ' ADODB.Recordset
Dim i As Long
Set cn = CurrentProject.Connection
Set rs = cn.OpenSchema(adSchemaIndexes)
With rs
    ' enable next three lines to view all the recordset column names
    ' For i = 0 To (.Fields.Count - 1)
    '     Debug.Print .Fields(i).Name
    ' Next i
    Do While Not .EOF
        Debug.Print !TABLE_NAME, !INDEX_NAME, !PRIMARY_KEY
        .MoveNext
    Loop
    .Close
End With
Set rs = Nothing
Set cn = Nothing
As written, that procedure displays only 3 of the recordset's 25 columns. If you follow the comment instruction to "enable next three lines", you can view the names of all the available columns. One column in particular (UNIQUE) should be useful for your purpose. When you choose "Yes (Duplicates OK)" for the "Indexed" property in table design, it will be displayed as False in the UNIQUE column of the schema recordset.
I assumed you're not interested in information for indexes on system tables ("MSys*" table names), and that, for the non-system tables, you only want information for the indexes without a unique constraint. Here is how I modified the Do While loop accordingly:
Do While Not .EOF
    If !Unique = False And Not !TABLE_NAME Like "MSys*" Then
        Debug.Print !TABLE_NAME, !INDEX_NAME, !COLUMN_NAME, !Unique
    End If
    .MoveNext
Loop
Here is the output from the revised procedure in my test database:
DiscardMe compound_4_5 f4 False
DiscardMe compound_4_5 f5 False
DiscardMe f3 f3 False
That table has 2 non-unique indexes. One of them is a compound index, based on 2 fields. So the schema recordset includes separate rows for each of those fields. The other index is based on a single field, so is presented as a single row in the recordset.
I think that gives you what your question asks for. But I have no idea how those indexes would interfere with migrating your table data to SQL Server. Good luck.
Yesterday I had to run a query in MS Access 2010. One field I needed was not in the tables I usually use (already linked through the ODBC database), and I didn't know which table it was part of (there are several hundred tables in the Machine Data Sources). Aside from manually importing all the tables and looking in each one for this field, is there a way I can search for a field without knowing the table, either 1. without importing any tables from the ODBC databases, or, if not, 2. by importing a handful of possible tables and searching once those tables have been linked into my active MS Access 2010 session?
Install Access Dependency Checker, link all the tables and search for the column name (enable the checkbox to search in linked tables).
You could do this in a function using ADO schemas.
Try this function in a standard module:
Function ListTablesContainingField(SelectFieldName) As String
    Dim cn As ADODB.Connection
    Dim rs As ADODB.Recordset
    Dim strTempList As String
    Set cn = CurrentProject.Connection
    'Get names of all tables that have a column called <SelectFieldName>
    Set rs = cn.OpenSchema(adSchemaColumns, _
                           Array(Empty, Empty, Empty, SelectFieldName))
    'List the tables that have been selected
    While Not rs.EOF
        'Exclude MS system tables
        If Left(rs!Table_Name, 4) <> "MSys" Then
            strTempList = strTempList & "," & rs!Table_Name
        End If
        rs.MoveNext
    Wend
    ListTablesContainingField = Mid(strTempList, 2)
    rs.Close
    Set cn = Nothing
End Function
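A quick way to try it from the Immediate window (the field name below is just an example, not from your database):
Debug.Print ListTablesContainingField("CustomerID") 'hypothetical field name; prints a comma-separated list of the tables containing it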
Every week, my analysts have a spreadsheet of invoices which they need to update with a check number and check date. The checks table exists in SQL server.
I've written them a macro that iterates through each row of the spreadsheet, opens an ADO recordset using a statement like this:
"SELECT CheckNumber, CheckDate FROM CHECKS WHERE Invoice_Number = " & Cells(i, 2)
... and then uses the fields from the recordset to write the number and date to the first two columns of that row in the Excel spreadsheet.
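For context, a stripped-down version of that row-by-row pattern might look like the sketch below; the connection string, server/database names and column layout are assumptions for illustration, not the actual macro.
'Minimal sketch (assumed layout: invoice numbers in column C,
'check number written to column A, check date to column B).
Dim cn As ADODB.Connection, rs As ADODB.Recordset
Dim i As Long, lastRow As Long
Set cn = New ADODB.Connection
cn.Open "Provider=SQLOLEDB;Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=SSPI;"
lastRow = Cells(Rows.Count, 3).End(xlUp).Row
For i = 2 To lastRow
    Set rs = cn.Execute("SELECT CheckNumber, CheckDate FROM CHECKS " & _
                        "WHERE Invoice_Number = " & Cells(i, 3).Value)
    If Not rs.EOF Then
        Cells(i, 1).Value = rs!CheckNumber
        Cells(i, 2).Value = rs!CheckDate
    End If
    rs.Close
Next i
cn.Close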
The code performs acceptably for a few hundred rows, but is slow when there are thousands of rows.
Is there a faster way to update an Excel spreadsheet than with a row-by-row lookup using ADO? For example, is there a way to do a SQL join between the spreadsheet and the table in SQL Server?
Edit: In response to Jeeped's questions, here's a bit of clarification.
What I'm really trying to do is find a way to "batch" update an Excel spreadsheet with information from SQL Server, instead of executing SQL lookups and writing the results a row at a time. Is there a way to do the equivalent of a join and return the entire result set in a single recordset?
The Invoice example above really represents a class of problems that I encounter daily. The end users have a spreadsheet that contains their working data (e.g. invoices) and they want me to add information from a SQL server table to it. For example, "Using the invoice number in column C, add the check number for that invoice in column A, and the check date in column B". Another example might be "For each invoice in column b, add the purchase order number to column a."
The Excel source column would be either a number or text. The "match" column in the SQL table would be of a corresponding data type, either varchar or integer. The data is properly normalized, indexed, etc. The updates would normally affect a few hundred or thousand rows, although sometimes there will be as many as twenty to thirty thousand.
If I can find a way to batch rows, I'll probably turn this into an Excel add-in to simplify the process. For that reason, I'd like to stay in VBA because my power users can extend or modify it to meet their needs--I'd rather not write it in a .NET language because then we need to dedicate developer time to modifying and deploying it. The security of the Excel application is not a concern here because the users already have access to the data through ODBC linked tables in an MS Access database and we have taken appropriate security precautions on the SQL Server.
Moving the process to SSIS would require a repeatability that doesn't exist in the actual business process.
In the past I've had success with pulling all of the data from SQL Server into a client-side disconnected ADO recordset. I then looped once through the entire recordset to create a VBA dictionary, storing the ID value (in this case the InvoiceNum) as the key and the recordset bookmark as the item. Then loop through each value, checking the invoice number against the dictionary using the "Exists" function. If you find a match you can set your recordset to the bookmark and then update the values on the spreadsheet from the recordset. Assuming the Invoice table isn't a few million rows, this method should prove speedy.
EDIT: Added batch processing to try to limit returned records from large datasets. (Untested Code Sample)
Public Sub UpdateInvoiceData(invoiceNumRng As Range)
    'References: Microsoft ActiveX Data Objects x.x
    'References: Microsoft Scripting Runtime
    Dim cell As Range
    Dim tempCells As Collection
    Dim sqlRS As ADODB.Recordset
    Dim dict As Scripting.Dictionary
    Dim iCell As Range
    Dim testInvoiceNum As String
    Dim inClause As String
    Dim i As Long
    Set tempCells = New Collection 'must be initialised before the first Add
    i = 1
    For Each cell In invoiceNumRng
        tempCells.Add cell 'collect the cell first so batch-boundary cells are not skipped
        If i Mod 25 = 0 Or i = invoiceNumRng.Cells.Count Then 'break the loop up into batches of 25; modify batch size here, try to find an optimal size
            inClause = CreateInClause(tempCells) 'limit sql query with our test values
            Set sqlRS = GetInvoiceRS(inClause) 'retrieve batch results
            Set dict = CreateInvoiceDict(sqlRS) 'create our lookup dictionary
            For Each iCell In tempCells
                testInvoiceNum = iCell.Value 'get the invoice number to test
                If dict.Exists(testInvoiceNum) Then 'test for match
                    sqlRS.Bookmark = dict.Item(testInvoiceNum) 'move our recordset pointer to the correct item
                    iCell.Offset(0, 1).Value = sqlRS.Fields("CheckNum").Value
                    iCell.Offset(0, 2).Value = sqlRS.Fields("CheckDate").Value
                End If
            Next iCell
            'prepare for next batch of cells
            Set tempCells = New Collection
        End If
        i = i + 1 'our counter to determine batch size
    Next cell
End Sub
Private Function CreateInClause(cells As Collection) As String
    Dim retStr As String
    Dim tempCell As Range
    retStr = ""
    For Each tempCell In cells
        retStr = retStr & "'" & tempCell.Value & "'" & ", " 'assumes textual value, omit single quotes if numeric/int
    Next tempCell
    If Len(retStr) > 0 Then
        CreateInClause = Left(retStr, Len(retStr) - 2) 'trim off last comma value
    Else
        CreateInClause = "" 'no items
    End If
End Function
Private Function GetInvoiceRS(inClause As String) As ADODB.Recordset
    'returns the listing of InvoiceData from SQL
    Dim cn As ADODB.Connection
    Dim rs As ADODB.Recordset
    Dim sql As String
    Set cn = New ADODB.Connection
    Set rs = New ADODB.Recordset 'the recordset must be instantiated before use
    cn.ConnectionString = "Your Connection String"
    sql = "SELECT * FROM [Invoices] WHERE InvoiceID IN (" & inClause & ")"
    cn.Open
    rs.CursorLocation = adUseClient 'use clientside cursor since we will want to loop in memory
    rs.CursorType = adOpenStatic 'static cursor so the recordset survives disconnection
    rs.Open sql, cn
    Set rs.ActiveConnection = Nothing 'disconnect from connection here
    cn.Close
    Set GetInvoiceRS = rs
End Function
Private Function CreateInvoiceDict(dataRS As ADODB.Recordset) As Dictionary
    Dim dict As Scripting.Dictionary
    Set dict = New Scripting.Dictionary
    If Not (dataRS.BOF And dataRS.EOF) Then 'skip entirely if there is no data to process
        dataRS.MoveFirst 'make sure we are on first item in recordset
        Do While Not dataRS.EOF
            If Not dict.Exists(CStr(dataRS.Fields("InvoiceNum").Value)) Then 'guard against duplicate invoice numbers
                dict.Add CStr(dataRS.Fields("InvoiceNum").Value), dataRS.Bookmark
            End If
            dataRS.MoveNext
        Loop
    End If
    Set CreateInvoiceDict = dict
End Function
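Calling it would then look something like this (the sheet and range are placeholders for wherever the invoice numbers actually live):
'Hypothetical call: invoice numbers assumed to be in C2:C5000 of the active sheet.
UpdateInvoiceData ActiveSheet.Range("C2:C5000")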
The best way to do this is to use SSIS and insert the information (through SSIS) into a range in the spreadsheet. Remember that SSIS expects the target range to be empty, and the row above the target range should also be empty. If you do this, you can schedule the SSIS job through the Windows scheduler.
I'd like to know how to create a database table in Excel, so that it may be used with ODBC.
I want to use ODBC, and I have two options: either MS Access or Excel.
As you probably know, in order to indicate some MS Access file or Excel file as an ODBC source, you need to follow:
Administrative Tools -> Data Sources (ODBC) -> Choose User DSN -> Choose either 'Excel Files' or 'MS Access Database' from the list -> Press 'Configure' -> finally choose the file (MS Access or Excel) as ODBC source
Well, it works fine with MS Access: I can connect to the file and see all the tables that I've created inside.
But when it comes to Excel, although I can connect to the file, I can't see the table that I've created inside.
I just used 'Table' on the 'Insert' tab, added some headers as column names, and gave the table a meaningful name. Is that the way to do it?
There are several ways you can reference "table" data in an Excel workbook:
An entire worksheet.
A named range of cells on a worksheet.
An unnamed range of cells on a worksheet.
They are explained in detail in the "Select Excel Data with Code" section of the Microsoft Knowledge Base article 257819.
The most straightforward way is to keep the data on a separate sheet, put the column names in the first row (starting in cell A1), and then have the actual data start in row 2.
To test, I created a User DSN named "odbcFromExcel" that pointed to that workbook...
...and then ran the following VBScript to test the connection:
Option Explicit
Dim con, rst, rowCount
Set con = CreateObject("ADODB.Connection")
con.Open "DSN=odbcFromExcel;"
Set rst = CreateObject("ADODB.Recordset")
rst.Open "SELECT * FROM [Sheet1$]", con
rowCount = 0
Do While Not rst.EOF
    rowCount = rowCount + 1
    If rowCount = 1 Then
        Wscript.Echo "Data row 1, rst(""LastName"").Value=""" & rst("LastName").Value & """"
    End If
    rst.MoveNext
Loop
Wscript.Echo rowCount & " data rows found."
rst.Close
Set rst = Nothing
con.Close
Set con = Nothing
The results were
C:\Users\Gord\Documents\__tmp>cscript /nologo excelTest.vbs
Data row 1, rst("LastName").Value="Thompson"
10 data rows found.
I hope that helps your Excel connection issue.
As a final comment I have to say that if you are doing something that takes "several seconds" to do in Excel but "takes around 20-25 min" to do in Access then I strongly suspect that you are using Access in a very inefficient way, but that's a topic for another question (if you care to pursue it).
EDIT
If you want to INSERT data into an Excel workbook then that is possible, but be aware that the default setting for an Excel ODBC connection is "Read Only" so you have to click the "Options>>" button and clear that checkbox:
Once that's done, the following code...
Option Explicit
Dim con
Set con = CreateObject("ADODB.Connection")
con.Open "DSN=odbcFromExcel;"
con.Execute "INSERT INTO [Sheet1$] (ID, LastName, FirstName) VALUES (11, 'Dumpty', 'Humpty')"
con.Close
Set con = Nothing
Wscript.Echo "Done."
...will indeed append a new row in the Excel sheet with the data provided.
However, that still doesn't address the problem of no "Tables" being available for selection when you point your "sniffer" app at an Excel ODBC DSN.
One thing you could try would be to create an Excel sheet with column headings in row 1, then select those entire columns and create an Excel "Defined Name". Then, see if your "sniffer" app recognizes that as a "table" name that you can select.
FWIW, I defined the name myTable as =Sheet1!$A:$C in my Excel workbook, and then my original code sort of worked when I used SELECT * FROM [myTable]:
C:\Users\Gord\Documents\__tmp>cscript /nologo excelTest.vbs
Data row 1, rst("LastName").Value="Thompson"
1048576 data rows found.
As you can see, it retrieved the first "record" correctly, but then it didn't recognize the end of the valid data and continued to read the ~1 million rows in the sheet.
I doubt very much that I will be putting any more effort into this because I agree with the other comments that using Excel as an "ODBC database" is really not a very good idea.
I strongly suggest that you try to find out why your earlier attempts to use Access were so unsatisfactory. As I said before, it sounds to me like something was doing a really bad job at interacting with Access.
I had a similar problem with some data recently. The way I managed to get around it was to select the data as a range A1:XY12345, then use the Define Name tool to name the range. When you connect to the Excel workbook via ODBC, this named range will appear as a "table," while ranges that you actually defined (per Excel) as a table, do not.
You just need to select as many columns as required from the first row of your Excel file and then give the selection a name in the Name Box to the left of the formula bar. Of course, you give a name to each column of the file too!
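If you prefer to create that defined name from code rather than through the Name Box, a minimal sketch (the sheet name and range below are assumptions):
'Hypothetical sheet and range: a workbook-level defined name over a bounded
'block, so ODBC exposes it as a "table" without the ~1 million empty rows
'you get from naming entire columns.
ThisWorkbook.Names.Add Name:="myTable", RefersTo:="=Sheet1!$A$1:$C$11"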
I have a database with a few multi-valued lookup fields. When I split my database, there is a repeated error that the junction table is not found. I know Access makes shadow tables when you use the lookup wizard. How do I link these tables?
I tried the following code:
Sub refresh()
    Dim db As Database
    Dim rs As Recordset
    Set db = CurrentDb
    Set rs = db.OpenRecordset("SELECT [Name] FROM [MSysObjects] WHERE ([Type] = 6);", dbOpenSnapshot, dbForwardOnly)
    Do While (Not rs.EOF)
        db.TableDefs.Delete rs.Fields("Name").Value
        rs.MoveNext
    Loop
    rs.Close
    Set rs = Nothing
    db.Close
    Set db = Nothing
End Sub
but when I ran it, it still gave me the same error message: that the hidden junction table (in this case called "TblAudienceTblProg") was not found.
Is there any way to get around this or do I have to restructure the whole back end to include the actual junction tables?
I think the multi-valued data type is only really useful when the backend is going to be in SharePoint or you do not plan to split a local database.
Basically, a multi-valued field is a many-to-many relationship without the hassle of creating a bridge table yourself.
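If you do end up restructuring, building the junction table yourself is not much work; a minimal sketch (the field names are guesses based on the "TblAudienceTblProg" name from the error, not your actual schema):
'Hypothetical DDL run from VBA: an explicit junction table to replace the
'hidden one Access maintains for the multi-valued field. Relationships to
'TblAudience and TblProg can then be added in the Relationships window.
CurrentDb.Execute _
    "CREATE TABLE TblAudienceTblProg (" & _
    "AudienceID LONG NOT NULL, " & _
    "ProgID LONG NOT NULL, " & _
    "CONSTRAINT PK_AudienceProg PRIMARY KEY (AudienceID, ProgID))", dbFailOnError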