Return empty cell instead of 0 in Google Sheets when data being displayed is an array from another sheet

I have a Google Sheet workbook with multiple sheets being used to track COVID-19 cases in institutions across the country. The built-in Google Sheets geo chart works perfectly for the data visualization I need to accomplish, with one issue: It currently can't differentiate between actual 0 and "no data", which super skews how the
(essentially you can choose what color to use on the map for high value, mid value, low value, and no value. Where it should be using the color for "no value", it uses the low value color instead which makes the visualization confusing.)
The reason it's doing that is the array it's using as its data source contains zeroes to represent "no data available".
The array is imported from a different sheet by using ={'State Totals'!N4:P54}. I found an explanation for how to generally use a formula to return empty cells, the example there being =if(B2-C2>0, B2-C2, " ").
I'm extremely noob when it comes to these formulas, and I cannot figure out if I can nest an IF condition in an array import, or vice versa, or... what or how.
Here's a link to the sheet in question, if that helps at all. Really I just need a formula that
Imports the array values
Returns empty cells in place of zeroes where they appear
I don't want to affect the origin sheet's zero handling, just the one that the chart's using. (I also am absolutely not being paid enough to try and rig up a better map with Google Data Queries instead of the in-built Google Sheets chart maker, so here's to hoping it's a simple matter of syntax.)

Instead of ={'State Totals'!N4:P54}, use:
=ARRAYFORMULA(IF('State Totals'!N4:P54=0,,'State Totals'!N4:P54))
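To make the per-cell logic of that formula explicit, here is a minimal sketch in plain JavaScript. The function name blankZeros is hypothetical and not part of the answer; it only mirrors what IF(cell=0,,cell) does for each cell of a two-dimensional range, under the assumption that blank source cells arrive as empty strings.

```javascript
// Hypothetical helper (not from the answer): mirrors IF(cell=0,,cell) per cell.
// Assumption: blank source cells arrive as "" and should stay blank.
function blankZeros(values) {
  return values.map(row =>
    row.map(cell => (cell === 0 || cell === "" ? null : cell))
  );
}

// Rows become [null, 12] and [7, null]; non-zero values pass through untouched.
console.log(blankZeros([[0, 12], [7, 0]]));
```

The formula itself remains the simpler option; the sketch is only meant to show that zeroes are blanked in the copy while the origin range is left alone.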

Related

Get value of same cell from multiple sheets

I'd like to get the value of cell A1 from multiple tabs in my sheet. I attempted to use INDIRECT, only to be reminded that it wouldn't work in an ARRAYFORMULA. Is there some sort of "makeshifty" formula that would do the same thing?
This is what I originally attempted:
=ArrayFormula(IF(LEN(A2:A),INDIRECT(A2:A&"!A1"),))
Column A is a select list of the tab names in my sheet. So the first instance works, but of course, it doesn't populate down the column as I had hoped. I realize I can just copy the formula down the column, but some type of an ARRAYFORMULA would be ideal as I add rows to the list.
I found this answer, but don't see how I could apply it to my situation.
I also found this answer, but thought since it's 2.5 years old, maybe someone has discovered a clever way to avoid the drag of copying.
Answer:
You need to do this with a script or by using the drag method - INDIRECT uses a string reference and so can't be used with an array.
More Information:
Unfortunately for the user of INDIRECT with ARRAYFORMULA, discovering a clever method isn't the issue - the root of this problem is the limit on what can be done with formulae alone.
Setting up a custom function:
From the Tools > Script editor menu item, you can create scripts. An example custom function that you could use would be as follows:
function ARRAYINDIRECT(input) {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  // reduce() keeps the non-empty sheet names from the first column;
  // map() then reads cell A1 from the sheet with each of those names.
  return input.reduce(function(array, el) {
    if (el[0]) array.push(el[0]);
    return array;
  }, []).map(x => ss.getSheetByName(x).getRange("A1").getValue());
}
Make sure to then save the script with the save icon.
In your Sheet you can then call this custom function like so:
=ARRAYINDIRECT(A2:A)
Rundown of the function:
Takes the input range from where the formula is called - in this case A2:A
Reduces the input to remove all cells that are empty
Maps the name of the sheet which is stored in the cell to the value in A1 of the sheet it references
Returns an array to output
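The rundown above can be tried outside of Sheets with a small stand-alone sketch. The names arrayIndirectSketch, lookupA1 and fakeSheets are hypothetical: lookupA1 stands in for ss.getSheetByName(name).getRange("A1").getValue(), which only exists inside Apps Script.

```javascript
// Stand-alone sketch of the reduce/map pipeline described above.
// `lookupA1` is a hypothetical stand-in for the SpreadsheetApp lookup.
function arrayIndirectSketch(input, lookupA1) {
  // Step 1: keep only the non-empty names from the first column.
  const names = input.reduce((acc, row) => {
    if (row[0]) acc.push(row[0]);
    return acc;
  }, []);
  // Step 2: map each sheet name to the value of its A1 cell.
  return names.map(name => lookupA1(name));
}

// Made-up data: two sheets whose A1 cells hold 10 and 20.
const fakeSheets = { Jan: 10, Feb: 20 };
const a1Values = arrayIndirectSketch(
  [["Jan"], [""], ["Feb"], [""]],
  name => fakeSheets[name]
);
console.log(a1Values); // [ 10, 20 ]
```

Because empty cells are dropped before the lookup, the output is compacted: if the list of names has gaps, the results will not line up row by row with column A.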
References:
JavaScript Array reduce() Method
INDIRECT - Docs Editors Help
ARRAYFORMULA - Docs Editors Help

macro for automatic refresh function on array

I am creating a dynamic reporting tool that creates reports from data sourced from Wonderware. The data that is sourced is gathered from various pumps/flows/temps around site for operators/management to use. I want to create a dynamic sheet rather than use the wizards available because of limited IT experience of some of the operators.
I have managed to create the report but have one issue that I cannot resolve, and fixing it would make the sheet more user friendly.
I have some array formulas that link to cells that have dropdowns. (This is what helps make it user friendly.) The dropdown cells include which server to look at, which tagname to look for, the start time, the duration and the number of cells in the array.
When changing the number of cells in the array via the dropdown, the array doesn't change until you select a cell within the array and then select the Refresh Function command. This then changes the array.
I want to write a macro that will select several cells on the sheet that have individual arrays and select Refresh Function command. I will then assign this to a shape that can quickly and easily be selected.
Can anyone help with this macro please?
You just need to add the reference to ActiveFactoryWorkbook in visual basic editor, and then something like this:
Range("B11").Activate
ActiveFactoryWorkbook.wwRefreshFunction
Make sure that cell B11 is part of the array the query generates. Since you have to refresh more than one array, just copy the code again for each one and change the cell reference.
Sub Workbook_RefreshAll()
ActiveWorkbook.RefreshAll
End Sub

Array constant in a formula with non-adjacent cell references

I need to add an array of non-adjacent cells to my array formula. I have tried all of the following array constant-like ways and they all give me a "There is a problem with this formula" error.
'Chart Data'!{A12:A14,D3:D11}
{'Chart Data'!A12:A14,'Chart Data'!D3:D11}
'Chart Data'!{A12,A13,A14,D3:D11}
{'Chart Data'!A12,'Chart Data'!A13,'Chart Data'!A14,'Chart Data'!D3:D11}
'Chart Data'!{A12,A13,A14,D3,D4,D5,D6,D7,D8,D9,D10,D11}
{'Chart Data'!A12,'Chart Data'!A13,'Chart Data'!A14,'Chart Data'!D3,'Chart Data'!D4,'Chart Data'!D5,'Chart Data'!D6,'Chart Data'!D7,'Chart Data'!D8,'Chart Data'!D9,'Chart Data'!D10,'Chart Data'!D11}
Entire formula (the array constant goes where the {#####} is):
{=SUM(((1-References!M1:M12)*({#####}*(G3:G14+F3:F14-0.11)))+((References!M1:M12)*('Chart Data'!A12:A23*(G3:G14+F3:F14-0.11)))+((H2:H13*X3:X14)+(H3:H14*Y3:Y14)+(I2:I13*(V3:V14-X3:X14))+(I3:I14*(W3:W14-Y3:Y14))))}
I am 100% positive that it is this particular array constant that is causing the problem. I can't move the cells I'm referencing to put them in line. Is it even possible to reference a non-adjacent range in an array formula? If it's possible, what am I doing wrong?
There are several ways to do this. The following is very simple and pretty direct, so it is my favorite.
EITHER choose a cell in which to build the string for your non-contiguous array OR create a Named Range to do it. I'll show the first as it seems nicest for being able to use the mouse freely, but in both of them you can actually be creative about how you build the string that will become your array. The main advantage of creating it in a Named Range is no helper cell lying about anywhere.
So, you create that string and then make it an array. Say you need a non-contiguous array using cells B12:B14 and C3:C11. You use joining and TEXTJOIN() like so:
="{"&TEXTJOIN(",",FALSE,B12:B14,C3:C11)&"}"
to create a text string of the values in those cells wrapped with the curly braces ({}) just as if you'd typed it in ("hardcoded it"). It will look like this with the right values in those cells:
{1,2,3,1,2,3,4,5,6,7,8,9}
but it ain't an array yet.
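The string-building step is plain concatenation, which a short JavaScript sketch can make concrete. The function name toArrayLiteral and the sample values are hypothetical; they stand in for the TEXTJOIN() formula and the cell contents, and the result is still just text, exactly as with the formula.

```javascript
// Hypothetical cell values standing in for the two non-contiguous blocks.
const firstBlock = [1, 2, 3];
const secondBlock = [1, 2, 3, 4, 5, 6, 7, 8, 9];

// Equivalent of "{" & TEXTJOIN(",", FALSE, block1, block2) & "}".
// Empty entries are kept as empty positions, like TEXTJOIN with FALSE.
function toArrayLiteral(...blocks) {
  return "{" + blocks.flat().join(",") + "}";
}

console.log(toArrayLiteral(firstBlock, secondBlock));
// {1,2,3,1,2,3,4,5,6,7,8,9}
```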
Now the magic in THIS method. Create a Named Range, perhaps called String2Array, and give it a formula of:
=EVALUATE(A1)
(or whatever cell you used for the above formula creating the text string that you want to be an array). Make the reference absolute. ($A$1... which it will do for you, just don't edit it to be relative. If you use this for similar work, but need it relative, that will work fine, but it just isn't what is needed here.)
Now replace your placeholder in the formula with the Named Range's name (perhaps you DID use String2Array). And you're done.
A couple other methods use INDEX() or CHOOSE() and you can force things to be arrays using the functions DOLLARDE() and IMREAL() (I found on a helpsite in a 2014 post) and some others do the same kind of thing. In those days, one had to use {CSE} too, but SPILL takes care of that now (with those two weird-seeming friendlies and at least two others). The poster was someone I've seen on this site, EXCELXOR was the name for the site, XOR LX was the name of the member here though the functions were mentioned in a comment by a Lori. Since he covers, it seems, aspects not usually covered in helpsites, looking up some of his work here, or elsewhere too, might be worthwhile to some folks.
But this method is very direct and therefore easy to maintain. And personally, I love the idea that EVALUATE() (must be used IN the Named Range functionality, not cell-side) is the gift that keeps on giving, one wonderfully helpful thing after another.
So many ways. You could even literally build the array in a helper column/row somewhere and reference THAT instead of the non-contiguous addresses. I like the joining+TEXTJOIN() approach best because I can use the mouse to easily get all the blocks into the formula since it is a LIVE formula. But you can type out a string fairly easily too and add the {}'s. Or perhaps a user would type a string of addresses and you'd add them like the formula does above. And you can insert actual values (constants) into the string you are building as well if that is appropriate. And you could build it formulaically... I wouldn't pick that workload first thing off the pile of choices, but if you were going to do it anyway already, then... or if it's a small build.

Issue reading .txt file in Matlab. I want to get an array from this file without the unnecessary info it contains

I'm having trouble reading a .txt file to get some data. I want to be able to change some of the values this data contains.
At first, I used this:
A=importdata('myfile.txt');
And got this cell array:
Now what I want is this:
1) Get rid of the headers (the information from cell 1 to 22). That could be easily done by simple indexing (creating a sub-array using just the info from cell 23 to the end of the file).
2) I want to separate the information into different cells, using these identifiers. But I don't know how to separate them into different cells of the array.
'# item bepoch ecode label onset diff dura b_flags a_flags enable bin'
3) Do the same in step 2 to fill those columns with the data in the rest of the cells.
I'm trying to use this approach, but I'm not getting the expected results.
If someone can help me, I'd be glad.
Have you tried dragging the file into the variable workspace window and using the data import wizard? It has some nice features that normally take care of what you are trying to do automatically. Unfortunately, it seems that your text file may have unconventional spacing, but Matlab may be able to handle it if you set the delimiter to ' ' or suchlike.

Excel VBA Programming with Arrays: To Pass them or Not To Pass them?

Question: I am wondering which is the optimal solution for dealing with Arrays in Excel 2003 VBA
Background: I have a Macro in Excel 2003 that is over 5000 lines. I have built it over the last 2 years adding new features as new Procedures, which helps to segment the code and debug, change, or add to that feature. The downside is that I am using much of the same base information in multiple procedures, which requires me to load it into arrays with minor differences multiple times. I am now running into issues with the length of run time, so I am now able to do a full rewrite.
This file is used to grab multiple items of manufacturing flows (up to 4 different set ups with a total of up to 10 distinct flows , of up to 1000 steps each) with the information being Flow specific, Sub-Flow specific for grouping / sorting purposes, and Data (such as movements, inventory, CT, ...)
It then will stick the data onto multiple sheets used to manage the process utilizing data sheets to be perused, charts, and Cell Formatting to denote process flow capability / history.
The Flow is in the Excel File, while the Manufacturing data is read in with 7 different OO4O Oracle SQL pulls, some reused multiple times
The Arrays are:
arrrFlow(1 to 1000, 1 to 4) as a Record Type with 4 strings
arrrSubFlow(1 to 1000, 1 to 10) as a Record Type with 4 strings, 2 integers, and 1 single
arrrData(1 to 1000, 1 to 10) as a Record Type with 1 string, 4 integers, 12 longs, and 1 single
arriSort(1 to 1000, 1 to 4) as Integer (Used as a pointer Array to sort the Flow, Sub Flow, and Data in a Group, Sub Group, and Step order while leaving the original arrays in Step order)
Possibilities:
1) Rewrite the macro into one big procedure that loads the data into master arrays dimensioned within the Procedure once
Pro: Dimensioned in the Procedure rather than as a Public Variable in the Module and not passed.
Con: Harder to debug with one mega procedure instead of multiple smaller ones.
2) Keep macro with multiple procedures but passing the Arrays
Pro: Easier to debug code with multiple smaller procedures.
Con: Passing Arrays (Expensive?)
3) Keep macro with multiple procedures but with the Arrays being Public Dim'ed variables in the Module
Pro: Easier to debug code with multiple smaller procedures.
Con: Public Arrays (Expensive?)
So, what's the community's verdict? Does anyone know the expense of using Public Arrays vs Passing Arrays? Is the Cost of either of these worth losing the ease of having my procedures being focused on one feature?
UPDATE:
I load Inventory Data at a discrete level (multiple per Step), Moves Data at a aggregate level (one per step), and the Beginning of Shift Inventory at an aggregate level. I aggregate the Inventory data by step placing it in Work State categories (Run, Wait,...) I create targets off data already on the sheets.
I have a Flow sheet that shows the Work Flows by Type, currently 3 products have a similar but not exactly the same flow, and 2 products are a different flow, that are similar but again not the same as each other. I have assigned each set of steps in the different flows a group and sub-group.
I place this data on multiple sheets, some in Step Order, some in group / sub-group order. I also need the data summed up by group and product, group / sub-group and product, portion of the line and product, and product.
I use Record Types so I actually have a readable three dimensional array, arrSubFlow(1,1).strStep (Step Name of the 1st Step of the 1st Device), arrData(10,5).lngYest (Yesterday's movement for the 10th Step of the 5th Device).
My main point of optimization is going to be in the section where I create 10 pages from scratch every single time. With Merging Cells, Borders, Headers, ... This is a very time consuming process. I will add a section that will compare my data with the page to see if it needs to be changed and if so, only then recreate it otherwise, I'll clear each section of data and only write data that changes to the sheet. This will be huge, based on my time logging data. However, whenever I update code, I always try to improve other aspects of the code as well. I see the loading of the data into a Structure (Array, RecordSet, Collection) once as both a little bit of optimization, but more so for data integrity, so I do not have the opportunity to load it differently for different sheets.
The main issues I see getting away from Arrays right now are:
* Already heavily invested in them, but this is not a good enough reason to not change
* Don't know if there is much cost to passing them, since it will be ByRef
* I use a Sort Function to create a Sorted "Pointer" array that lets me leave the Array in Step Flow order, while easily referencing it by Group / Sub-group order.
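The sorted "pointer" array idea is language-neutral, so here is a minimal sketch of it in JavaScript rather than VBA. The record fields (step, group, subGroup) and the function name buildSortPointers are assumptions made for illustration; the real macro uses Record Types with its own field names.

```javascript
// Hypothetical step records, kept in their original step order.
const steps = [
  { step: "Etch", group: 2, subGroup: 1 },
  { step: "Clean", group: 1, subGroup: 2 },
  { step: "Coat", group: 1, subGroup: 1 },
];

// Returns indexes into `records`, ordered by group, then sub-group,
// then original position. The records themselves are never moved.
function buildSortPointers(records) {
  const pointers = records.map((_, i) => i);
  pointers.sort((a, b) =>
    records[a].group - records[b].group ||
    records[a].subGroup - records[b].subGroup ||
    a - b
  );
  return pointers;
}

const order = buildSortPointers(steps);
console.log(order.map(i => steps[i].step)); // [ 'Coat', 'Clean', 'Etch' ]
```

Any other structure that replaces the arrays would need to offer the same two views of the data: the untouched step order and the group / sub-group order.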
Since I am always trying to write my code for now and the future, I am not against updating the arrays to either RecordSets or Collections, but not merely for the sake of changing them to learn something cool. My arrays work and from my research, they add seconds to the run time, not substantial amounts for this 2 minute report. So if another structure is easier to update in the future than two-dimensional arrays of Record Types, then please let me know, but does anyone know the cost of passing an Array to a procedure, assuming you are not doing a ByVal pass?
You've provided a good bit of detail, but it's still quite difficult to understand exactly what's going on without seeing some code. In your question, I can identify at least 4 big topics that you interweave throughout: Manufacturing, Data Access, VBA, and Coding Best-Practices. It's hard for me to tell exactly what you're asking because your question scope is huge. Either way, I appreciate your trying to write better code in VBA.
It's hard for me to understand exactly what you plan to do with the arrays. You say:
The downside is that I am using much of the same base information in multiple procedures, which requires me to load it into arrays with minor differences multiple times.
I'm not sure what you mean here. Are you using arrays to represent a row of data that you retrieved from a database? If so, you might consider using class modules instead of the usual "macro" modules. These will allow you to work with full-blown objects instead of arrays of values (or references, as the case may be). Classes take more work to set up and consume, but they make your code a lot easier to work with and will greatly help you to segment your code.
As user Emtucifor already pointed out, there may be objects such as ADO Recordset objects (which may require Access to be installed...not sure) that can help greatly. Or you might create your own.
Here's a long example of how using a class might help you. Although this example is lengthy, it will show you how a few principles of object-oriented programming can really help you clean up your code.
In the VBA editor, go to Insert > Class Module. In the Properties window (bottom left of the screen by default), change the name of the module to WorkLogItem. Add the following code to the class:
Option Explicit
' Private backing fields for the properties below
Private pTaskID As Long
Private pPersonName As String
Private pHoursWorked As Double
Public Property Get TaskID() As Long
    TaskID = pTaskID
End Property
Public Property Let TaskID(lTaskID As Long)
    pTaskID = lTaskID
End Property
Public Property Get PersonName() As String
    PersonName = pPersonName
End Property
Public Property Let PersonName(lPersonName As String)
    pPersonName = lPersonName
End Property
Public Property Get HoursWorked() As Double
    HoursWorked = pHoursWorked
End Property
Public Property Let HoursWorked(lHoursWorked As Double)
    pHoursWorked = lHoursWorked
End Property
The above code will give us a strongly-typed object that's specific to the data with which we're working. When you use multi-dimension arrays to store your data, your code resembles this: arr(1,1) is the ID, arr(1,2) is the PersonName, and arr(1,3) is the HoursWorked. Using that syntax, it's hard to know what is what. Let's assume you still load your objects into an array, but instead use the WorkLogItem that we created above. This way, you would be able to do arr(1).PersonName to get the person's name. That makes your code much easier to read.
Let's keep moving with this example. Instead of storing the objects in an array, we'll try using a collection.
Next, add a new class module and call it ProcessWorkLog. Put the following code in there:
Option Explicit
Private pWorkLogItems As Collection
Public Property Get WorkLogItems() As Collection
    Set WorkLogItems = pWorkLogItems
End Property
Public Property Set WorkLogItems(lWorkLogItem As Collection)
    Set pWorkLogItems = lWorkLogItem
End Property
Function GetHoursWorked(strPersonName As String) As Double
    On Error GoTo Handle_Errors
    Dim wli As WorkLogItem
    Dim doubleTotal As Double
    doubleTotal = 0
    ' Sum the hours of every item that belongs to the given person
    For Each wli In WorkLogItems
        If strPersonName = wli.PersonName Then
            doubleTotal = doubleTotal + wli.HoursWorked
        End If
    Next wli
Exit_Here:
    GetHoursWorked = doubleTotal
    Exit Function
Handle_Errors:
    ' You will probably want to catch the error that will
    ' occur if WorkLogItems has not been set
    Resume Exit_Here
End Function
The above class is going to be used to "do something" with a collection of WorkLogItem objects. Initially, we just set it up to total the number of hours worked. Let's test the code we wrote. Create a new Module (not a class module this time; just a "regular" module). Paste the following code in the module:
Option Explicit
Function PopulateArray() As Collection
    Dim clnWlis As Collection
    Dim wli As WorkLogItem
    ' Put some data in the collection
    Set clnWlis = New Collection
    Set wli = New WorkLogItem
    wli.TaskID = 1
    wli.PersonName = "Fred"
    wli.HoursWorked = 4.5
    clnWlis.Add wli
    Set wli = New WorkLogItem
    wli.TaskID = 2
    wli.PersonName = "Sally"
    wli.HoursWorked = 3
    clnWlis.Add wli
    Set wli = New WorkLogItem
    wli.TaskID = 3
    wli.PersonName = "Fred"
    wli.HoursWorked = 2.5
    clnWlis.Add wli
    Set PopulateArray = clnWlis
End Function
Sub TestGetHoursWorked()
    Dim pwl As ProcessWorkLog
    Dim arrWli() As WorkLogItem
    Set pwl = New ProcessWorkLog
    Set pwl.WorkLogItems = PopulateArray()
    Debug.Print pwl.GetHoursWorked("Fred")
End Sub
In the above code, PopulateArray() simply creates a collection of WorkLogItem objects. In your real code, you might create a class to parse your Excel sheets or your data objects to fill a collection or an array.
The TestGetHoursWorked() code simply demonstrates how the classes are used. Notice that ProcessWorkLog is instantiated as an object. After it is instantiated, a collection of WorkLogItem objects becomes part of the pwl object. You can see this in the line Set pwl.WorkLogItems = PopulateArray(). Next, we simply call the function we wrote, which acts upon the collection WorkLogItems.
Why is this helpful?
Let's suppose your data changes and you want to add a new method. Suppose your WorkLogItem now includes a field for HoursOnBreak and you want to add a new method to calculate that.
All you need to do is add a property to WorkLogItem like so:
Private pHoursOnBreak As Double
Public Property Get HoursOnBreak() As Double
    HoursOnBreak = pHoursOnBreak
End Property
Public Property Let HoursOnBreak(lHoursOnBreak As Double)
    pHoursOnBreak = lHoursOnBreak
End Property
Of course, you'll need to change your method for populating your collection (the sample method I used was PopulateArray(), but you probably should have a separate class just for this). Then you just add your new method to your ProcessWorkLog class:
Function GetHoursOnBreak(strPersonName As String) As Double
    ' Code to get hours on break
End Function
Now, if we wanted to update our TestGetHoursWorked() method to return the result of GetHoursOnBreak, all we would have to do is add the following line:
Debug.Print pwl.GetHoursOnBreak("Fred")
If you passed in an array of values that represented your data, you would have to find every place in your code where you used the arrays and then update it accordingly. If you use classes (and their instantiated objects) instead, you can much more easily update your code to work with changes. Also, when you allow the class to be consumed in multiple ways (perhaps one function needs only 4 of the object's properties while another function will need 6), they can still reference the same object. This keeps you from having multiple arrays for different types of functions.
For further reading, I would highly recommend getting a copy of VBA Developer's Handbook, 2nd edition. The book is full of great examples and best practices and tons of sample code. If you're investing a lot of time into VBA for a serious project, it's well worth your time to look into this book.
It sounds like maybe Excel and arrays are not the best tools for the job you're doing. If you could please explain a little bit about the type of data that you're working with and what you're doing, that will really help provide a better answer. Give as much detail as you can about the types of manipulations you're doing on the data and what the inputs and outputs are.
I'm going to give some highlights that I think will help you, and then may edit my answer to be more complete as I get responses from you, and so I have more time to flesh things out a bit.
There is an object that naturally handles the record-type objects you're working with called a Recordset. In the VBA editor, go to Tools -> References and add Microsoft ActiveX Data Objects 2.X Library (the highest one on your machine). You can declare an object of type ADODB.Recordset, then do Recordset.Fields.Append to add fields to it, then .Open it and finally .AddNew, set field values, and .Update. This is a natural object to pass around in programs as an input or output parameter. It has natural traversal and positioning functions (.Eof, .Bof, .AbsolutePosition, .MoveNext, .MoveFirst, .MovePrevious) and supports searching and filtering (.Filter = "Field = 'abc'", .Find and so on).
I don't recommend using public variables, though without an understanding of what you're doing I can't really advise you well here.
I also would avoid one big procedure. Code should be broken out into reusable functional units that do only one thing, whose names are essentially self-documenting about what they do.
If you want to improve the performance of your code, hit ctrl-break at random times while it's running and break into the code. Then press Ctrl-L to view the call stack. Make a note of what is in the list each time. If any item shows up a majority of the time, it is the bottleneck and is where you should spend your time trying to optimize it. However, I don't advise trying to optimize what you have until you make some higher-level decisions (like whether you will switch to a recordset).
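The note-taking step of this sampling technique can be sketched as a simple tally. The JavaScript below is only an illustration: the function name tallyStackSamples and the procedure names in the sample data are made up, and each sample stands for the call stack you wrote down after one break.

```javascript
// Counts, for each procedure, the share of samples whose call stack contains it.
function tallyStackSamples(samples) {
  const counts = new Map();
  for (const stack of samples) {
    // A Set makes sure a procedure is counted once per sample, even if recursive.
    for (const proc of new Set(stack)) {
      counts.set(proc, (counts.get(proc) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([proc, n]) => ({ proc, share: n / samples.length }))
    .sort((a, b) => b.share - a.share);
}

// Hypothetical notes from four breaks into the running macro.
const samples = [
  ["Main", "BuildPages", "MergeCells"],
  ["Main", "LoadData"],
  ["Main", "BuildPages", "ApplyBorders"],
  ["Main", "BuildPages", "MergeCells"],
];
console.log(tallyStackSamples(samples));
```

In this made-up data the entry point appears in every sample, as it always will, so the interesting entries are the ones below it that still show up in most samples.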
I really need more information to help you better.
If you're interested, I'll work up some demonstration code that will show how useful the Recordset object is. Inserting the data from a Recordset into an Excel range is super easy with Recordset.GetRows or .GetString (though some array transposition may be required, that's not hard, either).
UPDATE: If your goal is to speed up your process, then before doing anything I think it's best to be armed with the knowledge of what is taking the most time. Would you please hit ctrl-break about 10 times and note down the call stack each time, then tell me what the most common items in the call stack are?
In terms of improving the speed of cell formatting, here's my experience:
Merge is the slowest operation you can possibly do. Try to avoid it if at all possible. Using "center across selection" is one alternative. Another is just not merging, but using some combination of sizing properly, borders, cell background color, and turning off gridlines for the entire workbook.
Apply borders or other formatting once to the largest thing possible instead of to many small things such as cell by cell. For example, if most cells have all borders but some don't, then apply all borders to the entire range and during your looping remove the ones you don't want. And even then, try to do entire rows and larger ranges.
Save a template file with borders and formatting already applied. Let's say you put one row in it with the formatting for a certain section. In one step duplicate that row into as many rows are needed for that section, say 20 rows, and they will all have the same formatting. Duplicating rows is MUCH faster than applying formatting cell by cell.
Also, I wouldn't automatically go for using classes. While OO is great and I do it myself (heck, I just built 8 classes for something the other day to model a hierarchical structure so I could easily expose the parts of it when I needed them), in practice it can be slower. A simple set of public variables in a class is faster than using getters and setters. A user defined Type is even faster than a class, but you can run into gotchas trying to pass around UDTs in classes (they have to be declared in a non-class public module and even then they can give problems).
