Multiple line strings in Apache Zeppelin

I have a very long string that must be broken into multiple lines. How can I do that in Zeppelin?
The error is "missing argument list for method + in class String".
Here is the more complete error message:
<console>:14: error: missing argument list for method + in class String
Unapplied methods are only converted to functions when a function type is expected.
You can make this conversion explicit by writing `$plus _` or `$plus(_)` instead of `$plus`.
val q = "select count(distinct productId),count(distinct date),count(distinct instock_inStockPercent), count(distinct instock_totalOnHand)," +

In Scala (using Apache Zeppelin as well as otherwise), you can write expressions covering multiple lines by wrapping them in parentheses:
val text = ("line 1"
+ "line 2")

Using parentheses
As Theus mentioned, one way is to use parentheses.
val text = ("line 1" +
"line 2")
In fact, any expression that would otherwise be broken up by a line break can be wrapped in parentheses, like this:
(object.function1()
.function2())
Using """
For a multiline string, we can use triple quotes ("""), like this:
val s = """line 1
line2
line3"""
Any leading spaces before line2 and line3 will be included in the string. If we don't want the leading spaces, we can use a margin character with stripMargin, like this:
val s = """line 1
|line2
|line3""".stripMargin
Or use a different margin character:
val s = """line 1
$line2
$line3""".stripMargin('$')

Related

Best methods to extract substring into arguments and allow arguments with multiple "main split character" indicated by a character in Lua?

Let's say I have this string
"argument \"some argument\""
which prints out as
argument "some argument"
Now, as an example, say I want to split it using the space character as the "main split character", but also be able to mark, with another character, which parts should stay together as one argument. Say that character is the quotation mark ". In the end it should extract the arguments like so:
> [1] = "argument"
> [2] = "some argument"
I am able to do this with the code here:
ExtractArgs = function(text,splitKey)
    local skip = 0
    local arguments = {}
    local curString = ""
    for i = 1, text:len() do
        if (i <= skip) then continue end
        local c = text:sub(i, i)
        if (c == "\"") and (text:sub(i-1, i-1) ~= "\\") then
            local match = text:sub(i):match("%b\"\"")
            if (match) then
                curString = ""
                skip = i + match:len()
                arguments[#arguments + 1] = match:sub(2, -2)
            else
                curString = curString..c
            end
        elseif (c == splitKey and curString ~= "") then
            arguments[#arguments + 1] = curString
            curString = ""
        else
            if (c == splitKey and curString == "") then
                continue
            end
            curString = curString..c
        end
    end
    if (curString ~= "") then
        arguments[#arguments + 1] = curString
    end
    return arguments
end;
print(ExtractArgs("argument \"some argument\"", " "))
So what's the issue?
I am looking for better ways. The current approach doesn't let me use a literal " inside an argument, because that character is used as the delimiter.
Here is another way: Extract substring inbetween quotation marks, but skip \" and turn it into " instead in Lua
but I am wondering if there are even better ways.
Ways that would allow me to have something like this:
argument " argument2" "some argument" "some argument with a quotation mark " inside of it" another_argument"
to turn into something like this
> [1] = 'argument'
> [2] = 'argument2"'
> [3] = 'some argument'
> [4] = 'some argument with a quotation mark " inside of it'
> [5] = 'another_argument"'
The example I just made might be impossible to handle, because no character really indicates what should be processed and what should not.
But I am looking for better ways to extract substrings as arguments while allowing text that would normally be split into several arguments to be kept as a single argument.
So if it weren't for the ", \"some arguments\" would just have been split into "some" and "arguments" instead of "some arguments".
Maybe a method similar to how Lua itself treats ' and " could work.
It would probably be impossible to have a perfectly working system that turns the input """ into " as an extracted argument; I would expect it to be extracted as "", an empty string.
That would not be the case if the input looked like '"'. But then the next question is how to let ''' be extracted as '. I imagine it would work if it were written as "'". But this is getting too complicated.
Is there a better way to extract arguments while still allowing special operations, such as keeping several words together as one argument by wrapping them in something, or by any other means?

Improve code with checking element in array is digit

I want to check that each element in a String consists only of digits. First, I split the String into an Array using the regexp [, ]+, and then I try to check each element with forall and isDigit.
object Test extends App {
  val st = "1, 434, 634, 8"
  st.split("[ ,]+") match {
    case arr if !arr.forall(_.forall(_.isDigit)) => println("not an array")
    case arr if arr.isEmpty => println("bad delimiter")
    case _ => println("success")
  }
}
How can I improve this code, in particular the !arr.forall(_.forall(_.isDigit)) check?

Use matches, which requires the whole string to match the pattern:
st.matches("""\d+(?:\s*,\s*\d+)*""")
Details
In a triple-quoted string literal, there is no need to double-escape backslashes that are part of regex escapes.
Anchors (^ and $) are implicit when the pattern is used with .matches.
The regex means one or more digits, followed by zero or more repetitions of a comma surrounded by optional whitespace and then one or more digits.
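The pattern itself is not specific to Scala. As a rough, language-neutral illustration, here is the same full-match idea sketched in Python with re.fullmatch; the helper name is_number_list is made up for this example and is not part of the answer above.

```python
import re

# Same pattern as in the answer: digits, then any number of ", digits" groups.
NUMBER_LIST = re.compile(r"\d+(?:\s*,\s*\d+)*")


def is_number_list(text: str) -> bool:
    """Return True only if the whole string is a comma-separated list of numbers."""
    # fullmatch anchors the pattern at both ends, like String.matches on the JVM.
    return NUMBER_LIST.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["1, 434, 634, 8", "1,,,434", "1, a, 3"]:
        print(repr(sample), is_number_list(sample))
```

Because the whole string must match, empty items, trailing commas and non-digit items are all rejected without a separate split step.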

I think it can be simplified while also making it a bit more robust.
val st = "1,434 , 634 , 8" //a little messier but still valid
st.split(",").forall(" *\\d+ *".r.matches) //res0: Boolean = true
I'm assuming strings like "1,,,434 , 634 2 , " should fail.
The regex can be put in a variable so that it is compiled only once.
val digits = " *\\d+ *".r
st.split(",").forall(digits.matches)

Getting "#ERROR" in certain fields when using Split function in SSRS

I am using a split function to separate a column with two street addresses.
The information is separated by ,.
Some of the rows only have one address associated with them.
In those rows for my Street Address 2, I'm getting #ERROR when I want it to be null.
I've tried an IIF() statement for the expression, but I am having trouble with it.
Split(Fields!Street.Value, ",").GetValue(2)

Use a custom function for each address.
Adapted from: Split String
' Returns the first address, or Nothing when the field is empty.
Public Function GetAddress1(ByVal a As String) As String
    If String.IsNullOrEmpty(a) Then Return Nothing
    Dim b() As String = Split(a, ",")
    Return b(0)
End Function
' Returns the second address, or Nothing when there is no comma to split on.
Public Function GetAddress2(ByVal a As String) As String
    If String.IsNullOrEmpty(a) Then Return Nothing
    Dim b() As String = Split(a, ",")
    ' Guard against rows with only one address instead of indexing blindly.
    If b.Length < 2 Then Return Nothing
    Return b(1)
End Function

Unlike the If statement, IIf statements evaluate all code paths even though only one code path is used. This means that an error in an unused code-path will bubble up to an error in the IIf statement, preventing it from executing correctly.
To fix this, you need to use functions that won't throw an error when there is nothing to split.
Here is an example of code that should do what you want:
=IIf(
    InStr(
        InStr(
            Parameters!Street.Value
            , ","
        ) + 1
        ,
        Parameters!Street.Value
        , ","
    ) = 0
    , Nothing
    , Right(
        Parameters!Street.Value
        , Parameters!Street.Value.ToString().Length - (
            InStr(
                InStr(
                    Parameters!Street.Value
                    , ","
                ) + 1
                ,
                Parameters!Street.Value
                , ","
            )
        )
    )
)
Let's break this down.
I've used a combination of InStr(), Right(), Length(), and IIf() functions to split the string without throwing an error.
InStr() is used to find the position of the string "," within the address. This returns 0 rather than an error if it can't find the string.
InStr(Parameters!Street.Value, ",")
Since you appear to be looking for the second comma in your split function you will need to nest the InStr function. Use the location of the first comma as the start location to search for the second comma. Don't forget to +1 or you will find the first comma again.
InStr(InStr(Parameters!Street.Value, ",") + 1, Parameters!Street.Value, ",")
Now you can find the second comma without throwing an error even if no commas exist.
Based on the location of the second comma use the Right() function to grab all characters to the right of the second comma. Since Right() needs to know how many characters from the end rather than from the beginning, you will need to subtract the location of the comma from the Length() of the string. This effectively splits the string at the comma.
Right(
    Parameters!Street.Value
    , Parameters!Street.Value.ToString().Length - (
        InStr(InStr(Parameters!Street.Value, ",") + 1, Parameters!Street.Value, ",")
    )
)
If you have more than 2 commas you can grab just the string between the 2nd and 3rd comma by following up with a Left() function that finds the location of the 3rd comma.
Now you can use your IIf() function to return NULL (Nothing) if there is not a 2nd comma. The function at the top shows how this all fits together.
This could be cleaned up by using functions, but the provided code shows you how it can be done.
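To make the steps above easier to follow, here is the same logic as a small Python sketch. It is only an illustration of the idea, not SSRS code: the function name text_after_second_comma and the sample addresses are invented for the example, and Python's str.find returns -1 where InStr returns 0.

```python
from typing import Optional


def text_after_second_comma(street: str) -> Optional[str]:
    """Return everything after the second comma, or None if there is no second comma."""
    first = street.find(",")  # -1 when there is no comma at all
    if first == -1:
        return None
    # Start just past the first comma so the same comma is not found again.
    second = street.find(",", first + 1)
    if second == -1:
        return None
    # Equivalent to Right(s, Len(s) - position) in the expression above.
    return street[second + 1:]


if __name__ == "__main__":
    print(text_after_second_comma("1 Main St,Suite 2,Box 9"))
    print(text_after_second_comma("1 Main St"))
```

The important property is that a missing comma produces a "nothing" value rather than an error, which is what the nested InStr calls achieve in the report expression.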

Can't display unicode characters from file properly

I'm writing a script which should operate on words from a number of files which have unicode characters in a form of something\u0142somethingelse.
I use Python 3, so I assumed that after reading a line, \u0142 would be replaced by the 'ł' character, but it isn't. I get "something\u0142somethingelse" in the console.
After manually copying "bad" output from console and pasting it to: print("something\u0142somethingelse") it is displayed correctly.
Problematic part of the script:
list_of_files = ['test/stack.txt']
for file in list_of_files:
    with open(file,'r') as fp:
        for line in fp:
            print(line)
print("something\u0142somethingelse")
stack.txt:
something\u0142somethingelse
Output:
something\u0142somethingelse
somethingłsomethingelse
I experimented with utf-8 encoding when opening this file and really I'm out of ideas...

I think you can do what you want with ast.literal_eval. It parses literals using the same syntax as the Python interpreter: like eval, but safer. So this works, for example:
import ast
a = 'something\\u0142somethingelse'
b = ast.literal_eval('"' + a + '"')
print('"' + a + '"')
print(b)
The output should be:
"something\u0142somethingelse"
somethingłsomethingelse
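One caveat with wrapping the text in quotes is that a line read from a file usually ends with a newline, and may itself contain a double quote, either of which would break the literal. As an alternative sketch for Python 3, the unicode_escape codec can decode the escapes directly. The helper name decode_escapes is made up for this example, and it assumes that every backslash sequence in the file is meant to be interpreted as an escape.

```python
def decode_escapes(line: str) -> str:
    """Turn literal backslash-u sequences in a line into the characters they name."""
    text = line.rstrip("\n")  # drop the newline that file iteration leaves on each line
    # backslashreplace keeps any real non-Latin-1 characters as escapes, so the
    # unicode_escape decoder restores them instead of mangling them.
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")


if __name__ == "__main__":
    raw = "something\\u0142somethingelse\n"  # what a line read from stack.txt looks like
    print(decode_escapes(raw))
```

In the original loop this would be used as print(decode_escapes(line)). Note that other sequences such as \n or \t inside the text would also be interpreted, which may or may not be wanted.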

Two strings look the same, but comparing them returns false

I am comparing two strings. String 1 (expectedResult) is read from an Excel sheet, and String 2 (actualResult) comes from a web page via getElementByXPath("errorMsg_userPass").getText();
But when I compare the two strings, the result is false even though they look identical.
I don't know why this is happening. Please help.

Use trim() to remove leading and trailing spaces.

I recommend looking at the exact bytes of the actual and expected strings. There might be, for instance, a non-breaking space instead of a regular space; the strings will then look the same but will not be the same for equals.
You can see the difference by running the following snippet:
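import java.nio.charset.StandardCharsets;
import java.util.Arrays;

String a = "a\u00A0b"; // contains a non-breaking space
String b = "a b";      // contains a regular space
System.out.println(a + "|" + Arrays.toString(a.getBytes(StandardCharsets.UTF_8)));
System.out.println(b + "|" + Arrays.toString(b.getBytes(StandardCharsets.UTF_8)));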
Which will output:
a b|[97, -62, -96, 98]
a b|[97, 32, 98]
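If a non-breaking space does turn out to be the cause, one possible fix is to normalize it before comparing. The idea is sketched here in Python for brevity; the helper name normalize_spaces and the sample values are hypothetical, and the same replace-then-trim approach carries over to Java.

```python
def normalize_spaces(text: str) -> str:
    """Replace non-breaking spaces with regular spaces and trim both ends."""
    return text.replace("\u00a0", " ").strip()


expected = "a b"       # regular space, standing in for the spreadsheet value
actual = "a\u00a0b"    # non-breaking space, standing in for the web page value

if __name__ == "__main__":
    print(expected == actual)                                      # raw comparison
    print(list(actual.encode("utf-8")))                            # byte values of the page text
    print(normalize_spaces(expected) == normalize_spaces(actual))  # after normalizing
```

Python reports the bytes as unsigned values, so the non-breaking space appears as 194, 160 rather than Java's signed -62, -96.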
