I have two large tables of data that I have been playing around with to try and minimize computing resources. The two tables roughly have the following information:
1) Freight Billing information
InvoiceNum | TrackingNum | Weight | BilledAmt | IncentiveCredit
2) ERP Generated Invoice Information
InvoiceNum | Weight | ProductRev | FreightRev | TaxAmt | DiscountAmt
I'm comparing what the freight company charges per order (Freight Cost) with what we charge customers per order (Freight Revenue). I'd also like to compare the weights reported by the freight company with those reported by our company to audit accuracy.
There will only be one instance of the invoice in the ERP data, but there can be multiple lines in the freight data with the same invoice number. A simple SumIF column would work well if the tables weren't both approximately 100,000 lines. I know that working with Arrays is generally much faster than working directly with ranges, so I started there.
Option Explicit
Sub compareFreightData()
Application.ScreenUpdating = False
'Declare Sheets
Dim systemSht As Worksheet
Dim billingSht As Worksheet
Set systemSht = ThisWorkbook.Sheets("SystemData")
Set billingSht = ThisWorkbook.Sheets("UPSData")
'Declare Arrays
Dim billingArray As Variant
Dim systemArray As Variant
'Declare timer
Dim t As Single
t = Timer
'Add additional header info
systemSht.Range("K3").Value = "BilledTotal"
systemSht.Range("L3").Value = "IncentiveCredits"
systemSht.Range("M3").Value = "OrderWeight"
'Put info into arrays
billingArray = billingSht.ListObjects("UPSCSVFiles").DataBodyRange.Value
systemArray = systemSht.Range("A4:J" & systemSht.Cells(systemSht.Rows.Count,"A").End(xlUp).Row).Value
'ReDim to have the three additional columns in the array
ReDim Preserve systemArray(1 To UBound(systemArray, 1), 1 To UBound(systemArray, 2) + 3)
'****************************CALCULATE THE SUMIF*******************
'
'
'Timer
Debug.Print "Timer", Timer - t
Application.ScreenUpdating = True
End Sub
Some ideas I had were the following:
Sum all the matching items in a filtered array, then write out the total. (I think Filter() only works with one-dimensional arrays.)
Loop through the billing array, find the matching row in the system array using MATCH, then add to the system array. (This seems very slow.)
Mostly, I believe there is some easier way that I am missing and any input or ideas would be greatly appreciated.
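One possible way to fill in the SUMIF step without touching the worksheet is to index the ERP rows by invoice number in a Scripting.Dictionary and then make a single pass over the billing array. The following is only a sketch under stated assumptions: the helper name SumBillingIntoSystem is hypothetical, the invoice number is assumed to be in column 1 of both arrays, the billing columns 3, 4 and 5 are assumed to be Weight, BilledAmt and IncentiveCredit (per the layout listed above), and CreateObject("Scripting.Dictionary") assumes Excel on Windows.

```vba
' Hypothetical helper: accumulates billing totals into the last three
' columns of systemArray (BilledTotal, IncentiveCredits, OrderWeight).
' Assumed layout: invoice number in column 1 of both arrays; billing
' columns 3, 4, 5 = Weight, BilledAmt, IncentiveCredit (numeric cells).
Sub SumBillingIntoSystem(ByRef systemArray As Variant, ByRef billingArray As Variant)
    Dim rowOfInvoice As Object
    Dim r As Long, target As Long, lastCol As Long
    Dim key As String

    Set rowOfInvoice = CreateObject("Scripting.Dictionary")
    lastCol = UBound(systemArray, 2)

    ' Pass 1: map each ERP invoice number to its row (one row per invoice)
    For r = 1 To UBound(systemArray, 1)
        key = CStr(systemArray(r, 1))
        If Not rowOfInvoice.Exists(key) Then rowOfInvoice.Add key, r
    Next r

    ' Pass 2: add every billing line to the running totals of its invoice
    For r = 1 To UBound(billingArray, 1)
        key = CStr(billingArray(r, 1))
        If rowOfInvoice.Exists(key) Then
            target = rowOfInvoice(key)
            ' An Empty cell counts as 0, so the first addition just stores the value
            systemArray(target, lastCol - 2) = systemArray(target, lastCol - 2) + billingArray(r, 4)
            systemArray(target, lastCol - 1) = systemArray(target, lastCol - 1) + billingArray(r, 5)
            systemArray(target, lastCol) = systemArray(target, lastCol) + billingArray(r, 3)
        End If
    Next r
End Sub
```

Called at the placeholder in the macro above as SumBillingIntoSystem systemArray, billingArray, the result could then be written back in one step, for example with systemSht.Range("A4").Resize(UBound(systemArray, 1), UBound(systemArray, 2)).Value = systemArray. Because each lookup is a hash lookup, the work grows with the sum of the two table sizes rather than their product.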
I want to create a counter in a routine that counts how many times a specific entry has appeared so far.
The routine that I have created so far populates data in a spreadsheet through a For..Next loop. For each of these rows I have an extra column that holds the counter: how many times a characteristic of the entry row has appeared in the previous rows. For that, I am using the Application.WorksheetFunction.CountIf function, but the reference range has to be dynamic.
For example, I have the following table
Example Table
The overall idea is to group by month and expense type and get the summed amount. The role of the counter is to identify the rows that can be grouped together so I can loop through their values and sum them. The table has approximately 10,000 rows and 53 columns. For this process, I have created the following public type:
Public Type OP
    Month As String
    expense_type As String
    amount As Double
End Type
Sub NewOuput()
    Dim rec As OP
    Dim i As Long
    With Sheet1
        'output is the existing table that I get the data from; I want to manipulate it and then populate another table of the same format
        For i = 1 To lastrow
            rec.Month = output(i, 1)
            rec.expense_type = output(i, 2)
            rec.amount = output(i, 3)
            '----------------------------
            .Cells(i, 1) = rec.Month 'this is the population of the data in the new table
            .Cells(i, 2) = rec.expense_type
            .Cells(i, 3) = rec.amount
        Next i
    End With
End Sub
Through functions, I try to identify the rows that need to be summed up and then call the respective functions in the output part of the loop.
The CountIf Excel function cannot be applied to arrays, so that is now out of the question. I have read many posts on various ways of grouping, including data connections, collections and other customised approaches. Collections appeared to be the best option, but I am missing some of the background there.
Does this make any sense? Any suggestions are appreciated.
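Since CountIf cannot work on an array, one alternative is to keep the running count in a Scripting.Dictionary keyed on the month and expense type together. This is a minimal sketch, not the asker's routine: the function name RunningCounts is hypothetical, the array is assumed to be 1-based with the month in column 1 and the expense type in column 2, the "|" separator is assumed not to occur in the data, and CreateObject("Scripting.Dictionary") assumes Excel on Windows.

```vba
' Hypothetical helper: returns a 1-based array where element i is the
' number of times the (month, expense type) pair of row i has appeared
' in rows 1..i. Assumes data(i, 1) = month and data(i, 2) = expense type.
Function RunningCounts(ByRef data As Variant) As Long()
    Dim seen As Object
    Dim counts() As Long
    Dim i As Long
    Dim key As String

    Set seen = CreateObject("Scripting.Dictionary")
    ReDim counts(1 To UBound(data, 1))

    For i = 1 To UBound(data, 1)
        ' "|" is an assumed separator that does not occur in the data
        key = CStr(data(i, 1)) & "|" & CStr(data(i, 2))
        ' Reading a missing key yields Empty, which counts as 0 here
        seen(key) = seen(key) + 1
        counts(i) = seen(key)
    Next i

    RunningCounts = counts
End Function
```

The same dictionary could hold a running sum of the amount instead of a count, which would give the grouped totals directly without a separate counter column.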
I didn't fully grasp your exact needs, but judging from the table example image I'd go as follows:
Sub NewOuput()
With sheet1
'fill in the voids of 1st column
With .Range("A1:A" & .Cells(.Rows.Count, "B").End(xlUp).row) '<--| change "A" and "B" to your actual 1st and 2nd columns index
.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=R[-1]C"
.Value = .Value
End With
'more code to exploit a "full" database structure
End With
End Sub
I'm working with a database in excel. I will try to make it as simple as possible.
For example,
I have a vlookup range/array of fruits, and who likes each fruit.
Fruit - Person
1. Apple – DeShoun
2. Apple – John
3. Apple – Scott
4. Pear – Scott
5. Strawberries – John… etc.
In my database I have a list of fruit and the vendor that sells it
Fruit - Vendor
1. Apple – Sprouts
2. Apple – Walmart
3. Apple – Trader Joe’s
4. Strawberries – Abel Farms
5. Banana – Sprouts
6. Pear – Sprouts… etc.
I need to be able to find the fruit “apple” within my database and create new rows of information within the database so that it looks like the following.
Fruit - Vendor - Person
1. Apple – Sprouts - DeShoun
2. Apple – Walmart - DeShoun
3. Apple – Trader Joe’s - DeShoun
4. Apple – Sprouts - John
5. Apple – Walmart – John
6. Apple – Trader Joe’s - John
7. Apple – Sprouts - Scott
8. Apple – Walmart - Scott
9. Apple – Trader Joe’s – Scott
10. Strawberries – Abel Farms - John
11. Banana – Sprouts - #N/A
12. Pear – Sprouts - Scott
Since I will be working with a minimum of 1000+ rows, I need to know if there's a process to expedite this in any way.
Does anyone have any suggestions or links/articles that can point me in the right direction?
Feel free to comment or ask any questions that could help lead to a good answer.
Thanks
Let's say your Fruit-Person table is Table 2 and Fruit-Vendor is Table 3. Fruit is the common field across the tables here. You will need to build a Table 1 with the unique values from the Fruit column. (There are many ways of building a table with unique values; if you aren't aware of them, they should be easy to find online.)
I am listing the process for Excel-2013, there is a chance it might be slightly different in the older versions.
Step 0:
You have 3 tables as per earlier description.
Step 1:
Convert all of them to Tables one by one.
Press Alt+N, then T, or select the range (e.g. A1:A5) >> Insert >> Table. Tick "My table has headers".
Repeat this process for all 3 tables. They should look like this:
Step 2:
Create Pivot Table on Multiple Ranges
A) Create a Pivot Table on Table 1 (Insert >> PivotTable). Tick "Add this data to the Data Model". This is important.
B) Under Pivot Table fields, ALL; you should see all 3 tables
Step 3:
Create Relationships
In the Analyze tab, click Relationships. A box which says Manage relationships should open up. The idea is to build relationships.
A) Try building a relationship between Table 1 and Table 2.
New >> choose following options:
Table: Table 2
Column (Foreign): Fruit
Related Table: Table 1
Related Column (Primary): Fruit
B) Let's try building it now between Table 1 & 3
New >> choose following options:
Table: Table 3
Column (Foreign): Fruit
Related Table: Table 1
Related Column (Primary): Fruit
It should look like this:
Step 4:
Forming the Pivot
A) Get Fruit from Table 1, Person from Table 2, Vendor from Table 3 (in that order) as row labels
B) Now, Table2/Fruit and Table3/Fruit need to go as Value Labels.
The table so formed is your almost final table. The rows you want will be the ones which have a 1 in column D and E both. You can get those rows off by filtering/pasting as values.
(As a process, pasting images isn't the popular method it seems, but I couldn't have explained it better visually without them)
I'm pretty new to VBA, but had a bit of a fiddle around and this seems to work sort of as you describe (as a potential example...). Have put each table of data on a separate worksheet.
Sub FruityPerson_Matching()
Dim strFruit As String, strPerson As String, strVendor As String 'to hold text.
Dim myWB As Workbook, myWS_P As Worksheet, myWS_V As Worksheet, myWS_C As Worksheet
Dim LastRow As Long, n As Long, iNewRow As Long
Dim rFruit As Range, checkCell As Range
Set myWB = Application.ActiveWorkbook
Set myWS_P = myWB.Worksheets("Person")
Set myWS_V = myWB.Worksheets("Vendor")
Set myWS_C = myWB.Worksheets("Combined")
LastRow = myWS_P.Cells(myWS_P.Rows.Count, "A").End(xlUp).Row
First looping through list of people, finds the first instance of their fruit within the vendors list:
For n = 2 To LastRow
strFruit = myWS_P.Cells(n, 1).Value
strPerson = myWS_P.Cells(n, 2).Value
Set rFruit = myWS_V.Range("A:B").Find(What:=strFruit, LookIn:=xlValues, _
LookAt:=xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
MatchCase:=False, SearchFormat:=False)
If Not rFruit Is Nothing Then
Set checkCell = rFruit 'For checking when findnext gets back to original cell.
strVendor = myWS_V.Cells(rFruit.Row, 2).Value
Add this to a new row (so that sure it is blank) in the final combined data sheet:
iNewRow = myWS_C.Range("A" & myWS_C.Rows.Count).End(xlUp).Offset(1).Row
myWS_C.Range("A" & iNewRow).Value = strFruit
myWS_C.Range("B" & iNewRow).Value = strVendor
myWS_C.Range("C" & iNewRow).Value = strPerson
Since potential multiple vendors per fruit, now looping through them for same person:
Do
Set rFruit = myWS_V.Range("A:B").FindNext(After:=rFruit)
If Not rFruit Is Nothing Then
If rFruit.Address = checkCell.Address Then Exit Do
'Shows: are back at start.
strVendor = myWS_V.Cells(rFruit.Row, 2).Value
iNewRow = myWS_C.Range("A" & myWS_C.Rows.Count).End(xlUp).Offset(1).Row
myWS_C.Range("A" & iNewRow).Value = strFruit
myWS_C.Range("B" & iNewRow).Value = strVendor
myWS_C.Range("C" & iNewRow).Value = strPerson
Else
Exit Do
End If
Loop
Else
'What do if strFruit not found...?
Exit Sub
End If
Next
End Sub
Finally moving on to next person in loop etc until reaching the last row of data.
Something like what you had in mind?
It can be difficult to get your head round at first, but I would recommend looking into the INDEX MATCH functions. Used together they can do exactly what vlookup does, but with a little understanding they are far more flexible and may well be better suited to your needs :)
http://fiveminutelessons.com/learn-microsoft-excel/how-use-index-match-instead-vlookup
Might be helpful, or google to find a tutorial that suits you
Specifically for your problem, the hardest part will be matching every vendor, person to each fruit... VBA might be necessary
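For larger lists, one way to do the matching in VBA without repeated Find calls is to read both tables into arrays and index the people by fruit in a Scripting.Dictionary. This is a sketch under assumptions, not a drop-in solution: the sub name JoinFruitTables is hypothetical, the sheet names "Person", "Vendor" and "Combined" follow the answer above, each sheet is assumed to have headers in row 1 with fruit in column A, and CreateObject("Scripting.Dictionary") assumes Excel on Windows. The output is ordered by vendor row rather than by person, so it will not match the ordering in the question exactly.

```vba
' Sketch: joins Vendor rows to Person rows on fruit, entirely in memory.
' Assumes sheets "Person", "Vendor", "Combined" with headers in row 1,
' fruit in column A and person/vendor in column B.
Sub JoinFruitTables()
    Dim people As Variant, vendors As Variant
    Dim byFruit As Object
    Dim out() As Variant
    Dim r As Long, n As Long
    Dim p As Variant, key As String

    With ThisWorkbook.Worksheets("Person")
        people = .Range("A2:B" & .Cells(.Rows.Count, "A").End(xlUp).Row).Value
    End With
    With ThisWorkbook.Worksheets("Vendor")
        vendors = .Range("A2:B" & .Cells(.Rows.Count, "A").End(xlUp).Row).Value
    End With

    ' Index: fruit -> Collection of people who like it (case-insensitive)
    Set byFruit = CreateObject("Scripting.Dictionary")
    byFruit.CompareMode = vbTextCompare
    For r = 1 To UBound(people, 1)
        key = CStr(people(r, 1))
        If Not byFruit.Exists(key) Then byFruit.Add key, New Collection
        byFruit(key).Add people(r, 2)
    Next r

    ' First pass: count output rows so the result array can be sized
    For r = 1 To UBound(vendors, 1)
        key = CStr(vendors(r, 1))
        If byFruit.Exists(key) Then n = n + byFruit(key).Count Else n = n + 1
    Next r
    ReDim out(1 To n, 1 To 3)

    ' Second pass: one output row per (vendor, person) pair
    n = 0
    For r = 1 To UBound(vendors, 1)
        key = CStr(vendors(r, 1))
        If byFruit.Exists(key) Then
            For Each p In byFruit(key)
                n = n + 1
                out(n, 1) = vendors(r, 1)
                out(n, 2) = vendors(r, 2)
                out(n, 3) = p
            Next p
        Else
            ' No person likes this fruit: keep the vendor row with #N/A
            n = n + 1
            out(n, 1) = vendors(r, 1)
            out(n, 2) = vendors(r, 2)
            out(n, 3) = CVErr(xlErrNA)
        End If
    Next r

    ' Single write of the whole result below the header row
    ThisWorkbook.Worksheets("Combined").Range("A2").Resize(n, 3).Value = out
End Sub
```

Counting first and filling second avoids resizing the output array row by row, and the sheet is read twice and written once regardless of how many rows there are.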
Please note the edit after the original function code block
I've got this data set in Excel that I download from my company's cost management system each month. On average, this data set is around 100,000 rows with 32 columns. One of my job functions is to filter out line items that don't belong to my work group and arrange the data in the required format for a separate analysis system. Typically, I re-arrange the columns, enter a bunch of formulas into cells, and then use a series of autofilter checks to identify line items that need to be moved to other worksheets. This normally takes me about a couple of hours tops, but it's quite arduous and I'd rather automate the process to save time and reduce chances for me to make mistakes.
So I went ahead and wrote a VBA procedure that satisfies all of the requirements and everything seems to be checking out. The only problem is that the procedure itself takes about an hour to check 10,000 line items (I stopped it at that point). Wasting 10 hours watching a progress bar tick is not going to cut it. So now I'm trying to re-think how I've written this procedure to see if there's a better way (I'm certain there is).
Here's the code as it stands (I omitted a lot of code before and after the main loop for clarity, but I left comments there so you can see what happens in a 'pseudo-code' manner. The vast majority of time is spent in that loop, so it's really my main concern):
ORIGINAL FUNCTION
Function Prepare_CICTDF()
'Rename and set worksheet
wbRawFile.Worksheets("Sheet1").Name = "Excluded"
Set wsSheet = wbRawFile.Worksheets("Excluded")
'Update progress bar
status_message = "Rearranging columns in CICT Dedicated Facility. This may take several minutes."
Call Progress_Bar(current_row, status_message)
'Rearrange columns
'Omitted to shorten code block
'Create worksheet for included rows
wbRawFile.Worksheets.Add().Name = "Self Service"
'Copy header row to other worksheets
wsSheet.Rows("4").Copy Destination:=Sheets("Self Service").Range("A4")
'Import Lookup List
Dim wbLookupList As Workbook
Set wbLookupList = Workbooks.Open("\\server\path\to\file\Dedicated Facility Lookup List.xlsx")
Dim wsLookupList As Worksheet
Set wsLookupList = wbLookupList.Worksheets("Lookup List")
wsLookupList.Copy Before:=wbRawFile.Worksheets("Excluded")
wbLookupList.Close SaveChanges:=False
'Get first and last data row
Dim FirstRow As Long
Dim LastRow As Long
FirstRow = 5
LastRow = wsSheet.UsedRange.Rows.Count - 1
'Update progress bar
status_message = "Preparing rows in CICT Dedicated Facility."
Call Progress_Bar(current_row, status_message)
'Loop through the rows to add formulas
Dim NextBlankRow As Long
Dim RowDeleted As Boolean
Dim i As Long
i = FirstRow
'-------------------------LOOP STARTS HERE-------------------------
Do While i <= LastRow
RowDeleted = False
'Add "CICTDF" before project ID
wsSheet.Range("B" & i).Value = "CICTDF" & wsSheet.Range("B" & i).Text
'Add formula for "Total Impact" column in column T
wsSheet.Range("T" & i).FormulaR1C1 = "=IF(AND(RC[-10]=""Complete"",RC[7]=""Manual Part Number Line Item""),RC[5],IF(AND(RC[-10]=""Complete"",RC[5]=0),0,IF(RC[-10]=""Complete"",RC[5]/RC[-5]*RC[4],RC[5])))"
'Add formula for rows with blank "Cost Impact - Part" column
If wsSheet.Range("V" & i).Value = "" Then
wsSheet.Range("V" & i).FormulaR1C1 = "=IF(RC[-7]>0,RC[3]/RC[-7]*-1,0)"
End If
'Change GLOBAL SUPPLY NETWORK to GLOBAL PURCHASING
If wsSheet.Range("F" & i).Value = "GLOBAL SUPPLY NETWORK" Then
wsSheet.Range("F" & i).Value = "GLOBAL PURCHASING"
End If
'Change numbers stored as text back to numbers
wsSheet.Range("M" & i).NumberFormat = "General"
wsSheet.Range("M" & i).Value = wsSheet.Range("M" & i).Value
wsSheet.Range("P" & i).NumberFormat = "General"
wsSheet.Range("P" & i).Value = wsSheet.Range("P" & i).Value
wsSheet.Range("AB" & i).NumberFormat = "General"
wsSheet.Range("AC" & i).NumberFormat = "General"
wsSheet.Range("AD" & i).NumberFormat = "General"
wsSheet.Range("AE" & i).NumberFormat = "General"
'Insert Cab Part # Formula
wsSheet.Range("AB" & i).Formula = "=VLOOKUP(M" & i & ",'Lookup List'!A:A,1,FALSE)"
'Insert Cabs DC formula
wsSheet.Range("AC" & i).Formula = "=VLOOKUP(N" & i & ",'Lookup List'!B:B,1,FALSE)"
'Insert Cab Localization HEX & MG Formula
wsSheet.Range("AD" & i).Formula = "=VLOOKUP(B" & i & ",'Lookup List'!C:C,1,FALSE)"
'Insert Already in MOASS formula
wsSheet.Range("AE" & i).Formula = "=VLOOKUP(B" & i & ",'Lookup List'!D:D,1,FALSE)"
'Include part numbers that match the inclusion criteria
If wsSheet.Range("AB" & i).Text <> "#N/A" And wsSheet.Range("AC" & i).Text = "#N/A" And wsSheet.Range("AD" & i).Text = "#N/A" _
And wsSheet.Range("AE" & i).Text = "#N/A" And wsSheet.Range("P" & i).Value = "14" Then
NextBlankRow = Worksheets("Self Service").UsedRange.Rows.Count + 1
wsSheet.Rows(i).Copy Destination:=Worksheets("Self Service").Range("A" & NextBlankRow)
wsSheet.Rows(i).Delete
RowDeleted = True
End If
'Check if the row was included or not
If RowDeleted = True Then
LastRow = LastRow - 1
Else
i = i + 1
End If
'Update the progress completion
current_row = current_row + 1
Call Progress_Bar(current_row, status_message)
Loop
'-------------------------LOOP STOPS HERE-------------------------
'Autofilter header row in Self Service tab
Worksheets("Self Service").Range("B4:AG4").AutoFilter
'Save as new file format
Worksheets("Self Service").Select
wbRawFile.SaveAs Filename:=output_directory & "CICT 2014 Dedicated Facility.xlsx", FileFormat:=51
wbRawFile.Application.DisplayAlerts = True
wbRawFile.Close SaveChanges:=False
End Function
Basically I loop through all of the lines in the file. For each line, I enter the formulas and values that I need and then check if they satisfy the inclusion requirements. If they do, I move the line to the "Self Service" worksheet, delete the line from the "Excluded" worksheet, and move on to the next line.
After running the first 10,000 lines of data, the elapsed time was just over 58 minutes. I think most of this can be attributed to the copy/paste/delete processes at the tail end of the loop. I've read that a common suggestion is to work within arrays instead of manipulating cells/rows/ranges in Excel, but I'm not exactly sure how I would go about doing this.
----------Edit:----------
After some input from Ron Rosenfeld, I revisited my process a little bit and made a bunch of changes. In the end, the new procedure processes and prepares over 100,000 rows (of 32 columns) in just over 49 minutes. The original procedure would have taken over 9.75 hours, so the changes have resulted in a procedure that's over 10x faster than its predecessor. Rather than paste the entire code block again, I'll describe the procedure in "pseudo-code":
Rearrange columns (takes the raw server download and puts it in the order I need).
Create a new worksheet for the included rows. Note that for my purposes, I process over 100,000 rows but end up keeping only about 10,000. Thus, I made the decision to look for those that I would INCLUDE instead of those that I would EXCLUDE.
Enter formulas in the first row of data and fill them down the column. I used Ron's suggestion of e.g. Range("A" & FirstRow & ":A" & LastRow) = "=B1+C1" for any columns that I could.
There was one column that only needed formulas if the cell was blank. So I used the SpecialCells(xlCellTypeBlanks) method to enter these.
AutoFiltered the data so that only the rows I wanted to include were visible. Again I used the SpecialCells(xlCellTypeVisible) method to find these and stored them in an array. This array was then entered into the new worksheet.
Finally, I did a little bit of massaging format-wise to make sure everything looked consistent (since storing the values in the array lost the cell formats).
It should also be noted that I think Tim's suggestion of using SQL in this scenario could be a very efficient alternative--I simply wasn't versed well enough on the topic to try it out. I'll be looking for ways to use it in the future, though!
Thanks everyone for the help!
Without knowing the exact layout of your worksheets, it is hard to say. In general, with regard to values, reading a large DB into an array, looping through the array to decide which rows/items to keep, and then writing the result back to a new sheet is usually at least an order of magnitude (10x) faster than looping through rows on the worksheet. Sometimes the challenge is to figure out how big the results array needs to be. If that cannot be done with some simple formulas, I have taken the interim step of gathering each row into a Collection before dimensioning the results array.
Another thought after looking at your code: why not just filter on the values for columns 15, 27-30, and then copy/paste the visible cells to your new worksheet?
After you write the data to the worksheet; you can select all the blanks in the range with the SpecialCells method and write the formula that way:
R.columns(X).SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "= RC[1] + RC[2]".
To get the size of arrIncluded, it seems you could either use CountIfs; or you could add the desired rows to a Collection, then use the Count property to get the size of arrIncluded; and write the Collection to the array. I prefer the Collection method, but test to see which way is faster.
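The array-plus-Collection approach described above might look roughly like the following. It is a sketch only: the names CopyIncludedRows and RowQualifies are hypothetical, srcRange is assumed to be a multi-cell range (so that .Value returns a 2-D array), and the inclusion test shown (column 16 equal to 14) is just a stand-in for the real criteria.

```vba
' Sketch: copies qualifying rows from one range to another via arrays.
' srcRange must contain more than one cell; destCell is the top-left
' cell of the output area.
Sub CopyIncludedRows(ByVal srcRange As Range, ByVal destCell As Range)
    Dim data As Variant, arrIncluded() As Variant
    Dim keep As New Collection
    Dim item As Variant
    Dim r As Long, c As Long, i As Long

    data = srcRange.Value                      ' one read from the sheet

    ' Pass 1: remember the index of every row to keep
    For r = 1 To UBound(data, 1)
        If RowQualifies(data, r) Then keep.Add r
    Next r
    If keep.Count = 0 Then Exit Sub

    ' Pass 2: size the result from the Collection's Count and fill it
    ReDim arrIncluded(1 To keep.Count, 1 To UBound(data, 2))
    For Each item In keep
        i = i + 1
        For c = 1 To UBound(data, 2)
            arrIncluded(i, c) = data(item, c)
        Next c
    Next item

    ' One write back to the sheet
    destCell.Resize(keep.Count, UBound(data, 2)).Value = arrIncluded
End Sub

' Hypothetical inclusion test; assumes column 16 holds the value 14
' for rows to keep. Replace with the real criteria.
Private Function RowQualifies(ByRef data As Variant, ByVal r As Long) As Boolean
    RowQualifies = (CStr(data(r, 16)) = "14")
End Function
```

Only values are transferred this way, so cell formats and formulas are not carried over and would need to be reapplied on the destination sheet.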
Alright, I've been racking my brain, reading up excel programming for dummies, and looking all over the place but I'm stressing over this little problem I have here. I'm completely new to vba programming, or really any programming language but I'm trying my best to get a handle on it.
The Scenario and what my goal is:
The picture below is a sample of a huge long list of data I have from different stream stations. The sample only holds two (niobrara and snake) to illustrate my problem, but in reality I have a little over 80 stations worth of data, each varying in the amount of stress periods (COLUMN B).
COLUMN A, is the station name column.
COLUMN B, stress period number
COLUMN C, modeled rate
COLUMN D, estimated rate
What I have been TRYING to figure out is how to make a macro that will loop through the station names (COLUMN A) and for each UNIQUE Group of station names, make a chart that will pop out to the right of the group, say in the COLUMN E area.
The chart is completely simple, it just needs two series scatterplot/line chart; one series with COLUMN B as x-value and COLUMN C as y-value; and the other series needs COLUMN B as x-value and COLUMN D as y-value.
Now my main ordeal, is that I don't know how to make the macro distinguish between station names, use all the data relating to that name to make the chart, then looping on to the next Station group and creating a chart that corresponds for that, and to continue looping through all 80+ station names in COLUMN A and to make the corresponding 80+ charts to the right of it all in somewhere like the COLUMN E.
If I had enough points to "bounty" this, I would in a heartbeat. But since I do not, whoever can solve my dilemma will receive my sincere gratitude for helping me understand and run this problem smoothly, and hopefully for bettering my understanding of scenarios like this in the future. If there is any more information that I need to clarify to make my question more understandable, please comment with your query and I'd be happy to explain the subject in more detail.
Cheers.
Oh, and for extra credit: now that I think about it, I manually entered the numbers in COLUMN B. Since the loop needs to use that column as the x-value, it would be important for the macro to fill that column on its own before it makes the chart. (I would imagine it is as simple as counting out the rows that correspond to the station name.) But again, I don't know the proper terminology for matching on the station name, hence the pickle I'm in. I'd imagine that for a veteran programmer such a piece of code would be simple enough, yet it is crucial to the success of the macro I seek.
Try this
Sub MakeCharts()
Dim sh As Worksheet
Dim rAllData As Range
Dim rChartData As Range
Dim cl As Range
Dim rwStart As Long, rwCnt As Long
Dim chrt As Chart
Set sh = ActiveSheet
With sh
' Get reference to all data
Set rAllData = .Range(.[A1], .[A1].End(xlDown)).Resize(, 4)
' Get reference to first cell in data range
rwStart = 1
Set cl = rAllData.Cells(rwStart, 1)
Do While cl <> ""
' cl points to first cell in a station data set
' Count rows in current data set
rwCnt = Application.WorksheetFunction. _
CountIfs(rAllData.Columns(1), cl.Value)
' Get reference to current data set range
Set rChartData = rAllData.Cells(rwStart, 1).Resize(rwCnt, 4)
With rChartData
' Auto fill sequence number
.Cells(1, 2) = 1
.Cells(2, 2) = 2
.Cells(1, 2).Resize(2, 1).AutoFill _
Destination:=.Columns(2), Type:=xlFillSeries
End With
' Create Chart next to data set
Set chrt = .Shapes.AddChart(xlXYScatterLines, _
rChartData.Width, .Range(.[A1], cl).Height).Chart
With chrt
.SetSourceData Source:=rChartData.Offset(0, 1).Resize(, 3)
' --> Set any chart properties here
' Add Title
.SetElement msoElementChartTitleCenteredOverlay
.ChartTitle.Caption = cl.Value
' Adjust plot size to allow for title
.PlotArea.Height = .PlotArea.Height - .ChartTitle.Height
.PlotArea.Top = .PlotArea.Top + .ChartTitle.Height
' Name series'
.SeriesCollection(1).Name = "=""Modeled"""
.SeriesCollection(2).Name = "=""Estimated"""
' turn off markers
.SeriesCollection(1).MarkerStyle = -4142
.SeriesCollection(2).MarkerStyle = -4142
End With
' Get next data set
rwStart = rwStart + rwCnt
Set cl = rAllData.Cells(rwStart, 1)
Loop
End With
End Sub
Please help me make my app a little faster; it's taking forever to loop through and give me results right now.
Here is what I'm doing:
1. load gridview from an uploaded excel file (this would probably be about 300 records or so)
2. compare manufacturer, model and serial No to my MS SQL database (about 20K records) to see if there is a match.
'find source ID based on make/model/serial No combination.
Dim cSource As New clsSource()
Dim ds As DataSet = cSource.GetSources()
Dim found As Boolean = False
'populate db datatables
Dim dt As DataTable = ds.Tables(0)
Dim rows As Integer = gwResults.Rows.Count()
For Each row As GridViewRow In gwResults.Rows
    'move through rows and check data in each row against the dataset
    found = False
    For Each dataRow As DataRow In dt.Rows
        'match on make, model and serial No together
        If dataRow("manufacturerName").ToString() = row.Cells(1).Text AndAlso _
           dataRow("modelName").ToString() = row.Cells(2).Text AndAlso _
           dataRow("serialNo").ToString() = row.Cells(3).Text Then
            found = True
            Exit For
        End If
    Next
    'display results once per gridview row
    If found Then
        lblResults.Text += row.Cells(1).Text & "/" & row.Cells(2).Text & "/" & row.Cells(3).Text & " found "
    Else
        lblResults.Text += row.Cells(1).Text & "/" & row.Cells(2).Text & "/" & row.Cells(3).Text & " not found "
    End If
Next
Is there a better way to find a match between the two? I'm dying here.
For each of your 300 gridview rows, you are looping through all 20k datarows. That makes 300 * 20k = 6 million loop iterations. No wonder your loop is slow. :-)
Let me suggest the following algorithm instead (pseudo-code):
For Each gridviewrow
Execute a SELECT statement on your DB with a WHERE clause that compares all three components
If the SELECT statement returns a row
--> found
Else
--> not found
End If
Next
With this solution, you only have 300 loop iterations. Within each loop iteration, you make a SELECT on the database. If you have indexed your database correctly (i.e., if you have a composite index on the fields manufacturerName, modelName and serialNo), then this SELECT should be very fast -- much faster than looping through all 20k datarows.
From a mathematical point of view, this would reduce the time complexity of your algorithm from O(n * m) to O(n * log m), with n denoting the number of rows in your gridview and m the number of records in your database.
While Heinzi's answer is correct, it may be more beneficial to carry out the expensive SQL query once, before the loop, and then filter the in-memory results so you aren't hitting the DB 300 times:
Execute a SELECT statement on your DB and load the result into a DataTable
For Each gridviewrow
matches = dataTable.Select(String.Format("manufacturerName = '{0}'", gridviewrow.Cells(1).Text))
If matches contains a row
--> found
Else
--> not found
End If
Next
NOTE: I only compared a single criterion to illustrate the point; you could filter on all three here.
Hmm... how about loading the data from the spreadsheet into a table in tempdb and then writing a select that compares the rows in the way that you want to compare them? This way, all of the data comparisons happen server-side and you'll be able to leverage all of the power of your SQL instance.