Cell.cross() returns error in Google Refine projects - google-refine

I'm trying to create a new column based on my main project's Date column that pulls timeline events from another Google Refine project:
cell.cross("Clean5 Timeline", "TimelineDate").cells["TimelineEvent"].value[0]
The dates are in the same format in both Google Refine projects. But it fills no cells, and I get this error:
Error: Cannot retrieve field from null
This expression:
cell.cross("Clean5 Timeline", "TimelineDate")
returns [ ] for rows where there should be a match.
And this one:
cell.cross("Clean5 Timeline", "TimelineDate").cells["TimelineEvent"]
returns null for those rows.
I copied the syntax directly from the GREL help files: http://code.google.com/p/google-refine/wiki/GRELOtherFunctions. Can anyone suggest what I may be overlooking?
Thanks.

Without access to your projects it's going to be difficult to answer this, but the first thing I'd suggest is that you trim back your expression to find out exactly where the null is coming from.
Since
cell.cross("Clean5 Timeline", "TimelineDate")
is returning an empty array ([]), nothing based on that result is going to work.
There are four possible problems that I can think of: 1) the project name is wrong, 2) the column name is wrong, 3) the data values don't match (or Refine doesn't think they do), or 4) you are running into a caching bug with cross() that exists in Refine 2.5.
Restarting the Refine server should clear the cache if you're running into the bug and it's also fixed in the current source repository. The fix will be included in OpenRefine 2.6.
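To illustrate possibility 3, here is a minimal Python sketch of an exact-match lookup. It is not GREL and it is not how Refine implements cross(); the helper name cross_lookup and the sample rows are hypothetical. It only shows how a trailing space or a text-versus-date type difference can produce an empty result even when the values look identical on screen.

```python
from datetime import date


def cross_lookup(value, rows, key_column):
    """Return the rows whose key_column equals value exactly (hypothetical helper)."""
    return [row for row in rows if row[key_column] == value]


# Hypothetical stand-in for the "Clean5 Timeline" project.
timeline = [
    {"TimelineDate": "2012-05-01", "TimelineEvent": "Launch"},
    {"TimelineDate": "2012-06-15 ", "TimelineEvent": "Review"},  # trailing space
]

exact = cross_lookup("2012-05-01", timeline, "TimelineDate")
trailing_space = cross_lookup("2012-06-15", timeline, "TimelineDate")
type_mismatch = cross_lookup(date(2012, 5, 1), timeline, "TimelineDate")

# Only the first lookup finds a row; the other two come back empty.
print(len(exact), len(trailing_space), len(type_mismatch))
```

In Refine terms, the corresponding checks would be trimming whitespace in both columns and making sure both hold the same type (both text or both dates).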

Related

Numeric value 'abc_0011O00001y31VpQAI' is not recognized in Snowflake

(Opening the following on behalf of a Snowflake client...)
When I try to insert into the table it threw below error:
Numeric value 'abc_0011O00001y31VpQAI' is not recognized
I have checked the table DDL and found only 3 columns defined as NUMBER and the rest as VARCHAR.
I checked the SELECT query and did not find any string value in those NUMBER columns. I also searched all the VARCHAR columns for the value 'abc_0011O00001y31VpQAI' and didn't find it.
I know that Snowflake doesn't always show the correct error. Am I missing anything here? Is there any way to fix it?
Both COL4_MRR and COL5_QUANTITY are NUMBER
INSERT INTO TABLE
(COL1_DATE, COL2_CD, COL3_CUST_NAME, COL3_LOC_NAME,COL4_MRR,COL5_QUANTITY)
SELECT
'2019-10-03' AS COL1_DATE ,
'AE' AS COL2_CD
,CUSTOMER_NAME AS COL3_CUST_NAME
,LOCATION_NAME AS COL3_LOC_NAME
,MRR_BILLED as COL4_MRR
,QTY_BILLED as COL5_QUANTITY
FROM SCHEMA.V_TABLEA
union all
SELECT
'2019-10-03' AS COL1_DATE ,
'BE' AS COL2_CD
,CUSTOMER_NAME AS COL3_CUST_NAME
,LOCATION_NAME AS COL3_LOC_NAME
,NULL as COL4_MRR
,QTY_BILLED as COL5_QUANTITY
FROM SCHEMA.V_TABLEB
I created a table_D identical to the original TABLE and tried inserting into it; it worked fine. Then I inserted into the original TABLE from table_D, and it worked again.
I deleted those rows from the original TABLE and reran the job; it worked fine.
There was no issue with the data, as it was all numeric; I even tried TRY_TO_NUMBER too. It inserted the data without any changes to the code.
...............
The client is currently waiting on a next-day run to re-test and determine whether this is a bug or an issue with their data. In the meantime, we are interested to see if anyone else has run into similar challenges and has a viable recommendation. THANK YOU.
The error typically means you are trying to insert non-numeric data (like 'abc_0011O00001y31VpQAI') into a numeric column. It seems like the customer did everything right in testing and TRY_TO_NUMBER() is a great way to verify numeric data.
Do the SELECT queries run fine separately? If so, then I would check whether there might be a potential mismatch in the datatype of the columns and make sure they are in the right order.
I would also check whether or not the header is being skipped in the file (that may be where the 'abc_0011O00001y31VpQAI' is coming from since the customer did not see it in the data).
SELECT queries work fine, I tried creating a new table with same DDL as original and tried loading into that new table, it worked fine. Not sure why it is not loading into the original table
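As a rough illustration of the TRY_TO_NUMBER check mentioned above, the following Python sketch scans rows for values that would not convert to a number. It is not Snowflake code: try_to_number and find_bad_values are hypothetical helpers, the sample rows are invented, and Python's Decimal parsing only loosely approximates Snowflake's rules.

```python
from decimal import Decimal, InvalidOperation


def try_to_number(value):
    """Return a Decimal if value parses as a number, else None.

    Loosely mimics the idea of TRY_TO_NUMBER; parsing rules differ in detail.
    """
    if value is None:
        return None
    try:
        return Decimal(str(value).strip())
    except InvalidOperation:
        return None


def find_bad_values(rows, numeric_columns):
    """List (row index, column, value) for non-null values that fail conversion."""
    bad = []
    for index, row in enumerate(rows):
        for column in numeric_columns:
            value = row.get(column)
            if value is not None and try_to_number(value) is None:
                bad.append((index, column, value))
    return bad


# Invented sample: the second row carries an ID string in a numeric column,
# as could happen if SELECT columns were listed in the wrong order.
rows = [
    {"MRR_BILLED": "120.50", "QTY_BILLED": "3"},
    {"MRR_BILLED": "abc_0011O00001y31VpQAI", "QTY_BILLED": "1"},
    {"MRR_BILLED": None, "QTY_BILLED": "2"},
]

bad = find_bad_values(rows, ["MRR_BILLED", "QTY_BILLED"])
print(bad)
```

Running the equivalent check against each branch of the UNION ALL separately would show which source view, if any, supplies the offending value.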

"?" character in MSSQL DB getting replaced with (capital A with grave accent) when displayed by ASP script

I'm attempting to provide support for a legacy ASP/MSSQL web application - I wasn't involved in the development of the software (the company that built it no longer exists) & I'm not the admin of the server where it's hosted, I just manage the hosting for the owners of the site via a reseller account. I'm also not an ASP developer (more a PHP guy), and am not that familiar with it beyond the basics - updating DB connection strings after server migrations, etc.
The issue is that the site in question stores the content of individual pages in an MSSQL database, and much of the content includes links. Almost all of the internal links on the site are formatted like "main.asp?123" (with "123" being the ID of a database row). The problem is, starting sometime in the last 8 months or so*, something caused the links in the DB content to show up as "main.aspÀ123" instead - in other words, the "?" character is being replaced by the "À" character (capital A with grave accent). Which, of course, breaks all of those links. Note that Stackoverflow won't allow me to include that character in the post title, because it seems to think that it indicates I'm posting in Spanish...?
(*unfortunately I don't know the timing beyond that, the site owners didn't know when the issue started occurring, so all I have to go by is an archive.org snapshot from last October, where it was working)
I attempted to manually change the "?" character in one of the relevant DB records to "&#63;" (the HTML entity for the question mark), but that didn't make any difference. I also checked the character encoding of the HTML code used to display the content, but that doesn't seem to be the cause either - the same ASP files contain hard-coded links to some of the same pages (formatted exactly the same way), and those work correctly: the "?" doesn't get replaced.
I've also connected to the database directly with the MSSQL Management Studio Express application, but couldn't find any charset/character encoding options for either the database or the table.
And I've tried contacting the hosting provider, but they (M247 UK, in case anyone is curious) have been laughably unhelpful. The responses from them have been along the lines of "durrrrrr, we checked a totally different link that wasn't actually the one that you clearly described AND highlighted in a screenshot, and it works when we check the wrong link, so the problem must be resolved, right?" Suffice it to say, I wouldn't recommend them - I used to be a customer of RedFox hosting, and the quality of customer service has dropped off substantially since M247 bought them.
Any suggestions? If this were PHP/MySQL, I'd probably start by creating a small test script that did nothing but fetch one of the relevant records and display its contents, to narrow down the issue - but I'm not familiar enough with ASP to do that here, at least not without a fair amount of googling (and most of the info I can find is specific to ASP.net instead).
Edit: the thread suggested as a solution appears to be for character encoding issues when writing to MSSQL, not reading from it - and I've tried the solutions suggested in that thread, none make any difference.
Looks like you're converting from UNICODE to ASCII somewhere along the line...
Have a look at this to get a quick demo of what happens. In particular, pay attention to the ascii derived from int, versus the ascii derived from unicode...
SELECT
t.n,
ascii_char = CHAR(t.n),
unicode_char = NCHAR(t.n),
unicode_to_ascii = CONVERT(varchar(10), NCHAR(t.n))
FROM (
SELECT TOP (1024)
n = ROW_NUMBER() OVER (ORDER BY ao.object_id)
FROM
sys.all_objects ao
) t
WHERE 1 = 1
--AND CONVERT(varchar(10), NCHAR(t.n)) ='À'
;
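The same kind of lossy conversion can be sketched outside SQL Server. The Python snippet below is only an analogy: to_code_page is a hypothetical helper, and SQL Server's own conversion rules (collation-dependent, sometimes with best-fit mapping) may differ in detail. It shows that a character missing from the target code page comes back as a substitute, so the original cannot be recovered afterwards.

```python
def to_code_page(text, code_page):
    """Round-trip text through a code page, substituting unencodable characters."""
    return text.encode(code_page, errors="replace").decode(code_page)


samples = ["main.asp?123", "À", "€", "日本"]

# ASCII has none of the non-English characters, so each becomes "?".
ascii_results = [to_code_page(s, "ascii") for s in samples]

# Windows-1252 contains "À" and "€" but not the Japanese characters.
cp1252_results = [to_code_page(s, "cp1252") for s in samples]

for original, as_ascii, as_cp1252 in zip(samples, ascii_results, cp1252_results):
    print(repr(original), repr(as_ascii), repr(as_cp1252))
```

Note that a genuine "?" survives every one of these conversions, which is one reason to look at each stage between the database and the browser to see where the byte value actually changes.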
I found a workaround that appears to do the trick: I was previously trying to replace the ? in the code with &#63 (took out the ; so that it will show the code rather than the output), which didn't work. BUT it seems to work if I use &quest instead.
One thing to note: it seems I was originally incorrect in thinking that the issue only affected content being read/displayed from the MSSQL DB. Rather, it looks like the same problem was also occurring with static content being "echo'd" by code in the ASP scripts (I'm more of a PHP guy, so I'm not sure what the correct term for ASP's equivalent of echo is). The links that were hardcoded as static HTML (rather than HTML being dynamically output by ASP) were unaffected. Changing the ? to &quest worked for the echoed ones too (the hardest part was tracking down the file I needed to edit).

Grouping tablix by Parameter in SSRS (#Error) and automatically adding (0) when selected in expression

I hope this is not a stupid question, but I have searched everywhere and have tried everything.
I have a dashboard and would like to group the tablix (The dashboard is inside the tablix) by one of the Parameters (Consultant). There are a few Data sets(queries) in the report and all of the Parameters are filtered with IN in the where clause.
The problem I have is that when I go to the row group properties and select the Parameter in the expression, then it automatically adds a (0) at the end. If I take the (0) away then I get the error message:
the group expression used in grouping 'Group1' returned a data type that is not valid
I know the (0) is for getting the first value, but I am using Multi-valued Parameters.
I have tried one thing I found, but unfortunately it didn't work for me (SSRS Group By Parameter).
Edited:
This is to show you that there are multiple Data Sets(Queries) in this report
I have the dashboard in a tablix so that I can group for each Consultant, so when I choose 3 Consultants, I get 3 dashboards.
Expression used:
Then I get this error:
I have also tried using the CStr, but also no luck.
When I add the Parameter in any expression box, it automatically puts the (0) as below:
But then it doesn't use the parameter, as I get an #Error where it should be the Consultant name.
I also used this option for page break but end up with graphs below each other:
This is what happens to the Charts(Sub Reports)
To give you an idea how the dashboard should look for each Consultant.
Regarding the other question I saw. I just tried exactly as they said but also no luck
I hope this isn't too much information. Just trying to help you help me.
Thank you!
UPDATE:
Parameter Properties:
Have you tried using the list tool to separate the sub reports by Consultant? A list acts like a container and will create whatever is inside for your grouping. You should also be able to apply a parameter to the list for filtering.

Detect when (Select All) is checked for multi value parameter

I have a report with a multi-valued parameter on it. I'm looking to output the selected values which is accomplished with Join(Parameters!State.Label,",")
Every solution I've found on the web indicates I should use something like the following to detect when the (Select All) "value" is selected.
E.g. expression for the text box on the header should be:
="State: " & IIF(countrows("prc_prompt_state").Equals(Parameters!State.Count),"(All)",join(Parameters!State.Label,","))
CountRows() tells me the total number of parameters available, e.g. 8 states in Australia. Parameters!State.Count is supposed to tell me how many are actually selected by the user. However this always reports the full value (8 in this case) regardless of how many are selected. This is in agreement with the official docs (https://technet.microsoft.com/en-us/library/aa337293(v=sql.100).aspx), but NOT in agreement with every single search result I come up with on how to solve this problem.
So how can I rewrite this expression so I can find out when (Select All) is/isn't checked? I'm using report builder 3, which I believe is based on the 2008 edition - we deploy to Azure, but I haven't got that far yet.
Examples of questions whose answers seem to be wrong:
Displaying Multi-Value Parameters
SSRS: Can I know if user selected "ALL" in multivalued param?
This is old, but google found it for me, and then I figured out an answer on my own that worked. (I was using a list of users.)
I created a separate dataset that returns a count of all available options in the default parameter lookup (username). Then, I assigned that as a default value to an internal parameter. (UserCount) This worked as a text expression:
=Microsoft.VisualBasic.Interaction.IIF(Parameters!username.Count = Parameters!UserCount.Value, "All Selected", Microsoft.VisualBasic.Strings.JOIN(Parameters!username.Value, ", "))
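The logic of that expression can be sketched in Python as follows. This is not SSRS code: describe_selection is a hypothetical function, and the state labels are example data. It assumes the total number of available options is supplied separately, as the answer does with the internal UserCount parameter.

```python
def describe_selection(selected_labels, available_count):
    """Return "(All)" when every option is selected, else a comma-separated list."""
    if len(selected_labels) == available_count:
        return "(All)"
    return ",".join(selected_labels)


all_states = ["ACT", "NSW", "NT", "QLD", "SA", "TAS", "VIC", "WA"]

# available_count comes from a separate count of the lookup data,
# not from the selection itself.
everything = "State: " + describe_selection(all_states, len(all_states))
partial = "State: " + describe_selection(["NSW", "VIC"], len(all_states))

print(everything)
print(partial)
```

The comparison only works if the two counts come from independent sources; if both are derived from the same list, the check always reports "(All)".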

How to retrieve the name of a file and store it in the database using SSIS package?

I'm doing an Excel loop through fifty or more Excel files. The loop goes through each Excel file, grabs all the data and inputs it into the database without error. This is the typical process of setting delay validation to true, and making sure that the expression for the Excel Connection is a string variable called EFile that is set to nothing (in the loop).
What is not working: trying to input the name of the Excel file into the database.
What's been tried (edit; SO changed my 2 to 1 - don't know why):
Add a derived column between the Excel file and database input, and add a column using the EFile expression (so under Expression in the Derived Column it would be @[User::EFile]). However, this inputs a blank (nothing).
One suggestion was to add ANOTHER string variable, set its EvaluateAsExpression property to True, and set the Expression to the EFile variable (@[User::EFile]). The funny thing is that this does the same thing - inputs a blank into the database.
Numerous people on blogs claim they can do this, yet I haven't seen one actually address this (I have a blog and I will definitely be showing people how to do this when I get an answer because, so far, these others have fallen short). How do I grab an Excel file's name and input it in a database during a loop?
Added: Forgot to add, no scripts; the claim is that it can be done without them, so I want to see the solution without them.
Note: I already have the ability to import the data from the Excel files - that's easy (see my GitHub account, as I have two different projects for importing all sorts of txt, csv, xls, xlsx data). I am trying to also get the actual name of the file being imported also into the database. So, if there are fifty Excel files, along with the data in each file, the database will have the fifty file names alongside that data (so if each file has 1000 rows of data, each 1000 rows would also have the name of the file they came from next to them as an additional column). This point seems to cause a lot of confusion, as people assume I'm having trouble importing data in files - NOPE, see my GitHub; again that's easy. It's the FILENAME that needs to also be imported.
Test package: https://github.com/tmmtsmith/SSISLoopWithFileName
Solution: @jaimet pointed out that the Derived Column needed to be the @[User::CurrentFile] (see the test package). When I first ran the package, I still got a blank value in my database. But when we originally set up the connection, we do point it to an actual file (I call this "fooling the package"), then change the expression on the connection later to the @[User::CurrentFile], which is blank. The Derived Column, using the variable @[User::CurrentFile], showed a string of length 0. So, I removed the Derived Column, put the full file path and name in the variable, then added the variable to the Derived Column (which made it think the string was 91 characters long), then went back and set the variable to nothing (English teacher would hate the THENs about right now). When I ran the package, it inputted the full file path. Maybe, like the connection, it needs to initially think that a file exists in order for it to input the full amount of characters?
Appreciate all the help.
The issue is caused by the blank value in the variable @[User::FileNameInput]: it led the SSIS package to assume that the value of this variable will always be of zero length in the Derived Column transformation.
Change the expression on the Derived Column transformation from @[User::FileNameInput] to (DT_STR, 2000, 1252)@[User::FileNameInput].
Type casting the derived column to 2000 sets the column length to that maximum value. The value 1252 represents the code page; I assumed that you are using the ANSI code page. I took the value 2000 from your table definition because the FilePath column was defined as VARCHAR(2000). If the column data type had been NVARCHAR(2000), then the expression would be (DT_WSTR, 2000)@[User::FileNameInput].
Tim,
You're using the wrong variable in your Derived Column component. You are storing the filename in @[User::CurrentFile] but the variable that you're using in your Derived Column component is @[User::FileNameInput].
Change your Derived Column component to use @[User::CurrentFile] and you'll be good.
Hope that helps.
JT
If you are using a ForEach loop to process the files in a folder, then I have used the technique described in SSIS Junkie's blog to get the filename into an SSIS variable: SSIS: Enumerating files in a Foreach loop
You can use the variable later in your flow to write it to the database.
To all intents and purposes your method #1 should work. That's exactly how I would attempt to do it. I am baffled as to why it is not working. Could you perhaps share your package?
Tony, thanks very much for the link. Much appreciated.
Regards
Jamie
