Iterate through string - arrays

I am having difficulties with a Ruby assignment (we have just started learning it). I think I am close to the answer, but something is not right with how I store my string elements.
The assignment is to write a method HID_num(string) that takes a string.
To be valid, a string needs to have 4 uppercase letters followed by an 8-digit number (e.g. "ABCD18347692"). The method should return the individual number as an integer value, or nil if no valid individual card number is found.
Ex: HID_num("ADBC12345678 abcD12345678") should return 12345678
My code so far:
def HID_num(str)
str.gsub(/\s+/m, ' ').strip.split(" ")
str.each { |element|
if(element.count == 12)
count = str.count("A-Z")
if(count == 4)
count2 = str.count("1-9")
if(count == 8)
puts str[3...11]
else{
puts "nil"
} }
end end
What's wrong with it: it does not store the result of str.gsub.
I was thinking of using pointers, but I am not sure how to use them, or of storing each element in an array, but I have no idea how... Maybe I am thinking too much like a Java programmer?
Thanks!

This is pretty easy to do with regular expressions. Here's an example:
def hid_num(str)
exp = /\A[a-z]{4}(\d{8})\z/i
match = str.match(exp)
match && match[1]&.to_i
end
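In this case, the expression exp matches a string that consists of exactly 4 letters followed by 8 digits, from start to end. The i flag makes the letter match case-insensitive, so drop it if only uppercase letters should be accepted. Note that the whole string is validated, so input holding several space-separated entries has to be split first. Next, we match exp against the provided str. If there is no match, the method returns nil; otherwise, it returns the digit portion of the string, converted to an integer value.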
Hope that helps!

You could use:
def HID_num(string)
entries = string.split(" ")
entries.map do |entry|
matches = entry.match(/\A[A-Z]{4}(\d{8})\z/)
next unless matches
matches[1]
end.compact
end
It's a fairly simple regex matcher: it returns an array containing the number portion (as a string) of every valid entry. If you want to tweak it to return just the first valid number, you can change map to each, add break in front of matches[1], and finally remove the compact.
Hope that helps!
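A minimal sketch of the "first valid number" variant, assuming the result should be an Integer or nil as the question asks. The method name first_hid_num is hypothetical. It uses an early return rather than break, because each returns the original array when the loop finishes without breaking.

```ruby
# Hypothetical helper: returns the 8-digit part of the first valid entry
# as an Integer, or nil when no entry is valid.
def first_hid_num(string)
  string.split.each do |entry|
    matches = entry.match(/\A[A-Z]{4}(\d{8})\z/)
    # Leave the method as soon as a valid entry is found.
    return matches[1].to_i if matches
  end
  nil # each would otherwise return the array of entries
end

p first_hid_num("ADBC12345678 abcD12345678") # first entry is valid
p first_hid_num("abcD12345678")              # no valid entry
```

Calling to_i on the captured digits means leading zeros are lost; keep the string if they matter.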

Efficient way to filter an array based on element index

I have an array of arrays. So each element of my first array contains a comma separated list of values. If I use the split function, I can get an array from this comma separated list. What I need to do is filter out this second array based on element position. For example only keep columns one, three, five and nine.
One way to do this is to loop through my first array and, for each element, do a split to get my second array. Then loop through this second array, incrementing a counter to track the current element index. If the counter is equal to one of the columns I want to keep, concat the element to a string variable.
This is very inefficient and takes forever to run on large arrays. Does anyone have any ideas on a better way to do this? I hope I explained this clearly.
There are some built-in array actions like “Filter” and “Join”, as you mention, but for something this specific I imagine you’ll need to call some code (e.g. an Azure Function) to do the manipulation quickly and return the result.
For the first loop I don't know of any alternative, but for the second loop, instead of looping through the second array, you can simply access the elements at the indexes you require.
This assumes the size is not an issue.
string[] Arr1 = new string[] { "0_zero,0_One,0_Two,0_Three,0_Four,0_Five,0_six,0_seven,0_eight,0_nine",
"1_zero,1_One,1_Two,1_Three,1_Four,1_Five,1_six,1_seven,1_eight,1_nine" };
string myString = string.Empty;
foreach(var a in Arr1)
{
var sp = a.Split(',');
myString = string.Concat(myString, sp[1], sp[3], sp[5], sp[9]);
}
Console.WriteLine(myString); //gives "0_One0_Three0_Five0_nine1_One1_Three1_Five1_nine"
In case we're not sure of the length of each string, we can use an if-else ladder, in decreasing order from the maximum index that we want to use, like so:
foreach(var a in Arr1)
{
var sp = a.Split(',');
int len = sp.Length;
if (len >= 10) myString= string.Concat(myString, sp[1], sp[3], sp[5], sp[9]);
else if (len >= 6) myString = string.Concat(myString, sp[1], sp[3], sp[5]);
else if (len >= 4) myString = string.Concat(myString, sp[1], sp[3]);
else if (len >= 2) myString = string.Concat(myString, sp[1]);
}
This way we don't get an IndexOutOfRangeException.
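The same direct-indexing idea can be sketched in Ruby rather than C#. This is only an illustration: pick_columns and the sample rows are hypothetical. Array#values_at returns nil for an index past the end of a row, so short rows need no if-else ladder.

```ruby
# Hypothetical helper: keep only the wanted columns of each comma-separated row.
def pick_columns(rows, indices)
  rows.map do |row|
    # values_at yields nil for indices past the end; compact drops those.
    row.split(",").values_at(*indices).compact
  end
end

rows = ["0_zero,0_One,0_Two,0_Three", "1_zero,1_One"]
picked = pick_columns(rows, [1, 3, 5, 9])
puts picked.map { |cols| cols.join(",") }
```

Each row is split once and indexed directly, so the inner counter loop from the question disappears.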

Taking a string, reversing the letter in each word while leaving the word in its original position

I am trying to take a sentence, and reverse the positions of the letters in each word.
Below is my code that does not work:
def test(sentence)
array = []
array << sentence.split
array.collect {|word| word.reverse}
end
My problem is with:
array << sentence.split
It says it divides each word, but when I use interpolation, it reverses the whole sentence. Below is a similar code that works:
def test2
dog = ["Scout", "kipper"]
dog.collect {|name| name.reverse}
end
But it does not accept a sentence, and it already has the array defined.
I'm thinking you want to split, then map each element of the array to its reversed version then rejoin the array into a string:
def test(sentence)
sentence.split.map {|word| word.reverse}.join(" ")
end
More concise using symbol-to-proc (credit @MarkThomas in comments):
sentence.split.map(&:reverse).join " "
Unlike methods that break up the sentence into words, reverse each word and then rejoin them, the use of String#gsub with a regular expression preserves non-word characters (such as the comma in the example below) and multiple spaces.
"vieille, mère Hubbard".gsub(/\b\p{L}+\b/, &:reverse)
#=> "ellieiv, erèm drabbuH"
def reverse(str)
str.split.map { |s| s.length < 5 ? s : s.reverse }.join(' ')
end
puts reverse("this is a catalogy")
This reverses each word that is 5 or more characters long and leaves shorter words unchanged.
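To make the difference between the split/join and gsub approaches above concrete, here is a small side-by-side sketch. The sample sentence is made up, and \w only covers ASCII word characters, so \p{L} (as in the gsub answer) would be needed for accented letters.

```ruby
sentence = "hello,  big world"

# split/join: punctuation travels with its word and runs of spaces collapse.
by_split = sentence.split.map(&:reverse).join(" ")

# gsub: only runs of word characters are reversed; everything else stays put.
by_gsub = sentence.gsub(/\w+/) { |word| word.reverse }

puts by_split
puts by_gsub
```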

No var or breakable: How to "break" when a predicate is met, in an array traversal?

How would I write this programming logic into a functional method signature? I am attempting to loop/traverse an array until a condition is met, then break upon that condition. I'm mostly trying my best to avoid var and breakable from scala.util.control.Breaks. It makes use of a closure, in this case, dictionary, to check if a condition/predicate is met. The idea is that I am looping through an array until the predicate is met. I'm also avoiding converting my array to list. Would use of an array not allow me to splice the array, for example, to do pattern matching?
val dictionary = Array.fill(128)(false)
def isUnique(array: Array[Char]): Option[Char] = {
// traverse each element of the array {
// if a character.toInt is in the dictionary, insert into dictionary
// exit loop, with the character which broke the loop
// else
// set dictionary(character.toInt) to true and continue looping
// }
}
Here's an example use case:
val word = "abcdefggghijklmnopqrstuvqxyz".toArray
val charThatBrokeIt = isUnique(word)
Edit: Feel free to suggest or propose other return types as well, such as a Boolean, Tuple, Case Class, or any others. Option[Char] might not be a good result type on my part. For example, I could have returned a Boolean indicating whether the loop broke out early (short-circuited) or not.
First, a String already acts like a collection, so you should just use String instead of Array[Char]. Second, you can take advantage of laziness to allow short-circuiting while still splitting the algorithm into parts, using .view.
def breaksUnique(word: String): Option[Char] = {
val cumulativeSets = word.view.scanLeft(Set.empty[Char]){_ + _}
val zipped = cumulativeSets zip word
val nonDupsDropped = zipped dropWhile {case (set, char) => !(set contains char)}
nonDupsDropped.map{_._2}.headOption
}
The first two lines are written as if they process the entire word, but because they operate on a view, they are only calculated as needed.
cumulativeSets is a sequence of sets of every character that has been seen up to that point. If you ran it on "abb", you would get Set(), Set(a), Set(a,b), Set(a,b). That is combined with the original word using zip, giving (Set(),a), (Set(a),b), (Set(a,b),b). We then just have to drop all the pairs where the character doesn't appear in the set, then return the first element that wasn't dropped.
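For comparison, the same lazy, short-circuiting idea can be sketched in Ruby rather than Scala. This is an illustration, not a translation of the answer: first_duplicate is a hypothetical name, and it checks each character against the prefix before it instead of building cumulative sets, which is simpler but quadratic in the worst case.

```ruby
# Hypothetical helper: returns the first character that has already
# appeared earlier in the word, or nil if every character is unique.
def first_duplicate(word)
  (0...word.length).lazy
    .select { |i| word[0...i].include?(word[i]) } # seen in the prefix?
    .map { |i| word[i] }
    .first # lazy: stops at the first hit, nil if there is none
end

p first_duplicate("abcdefggghijklmnopqrstuvqxyz")
p first_duplicate("abc")
```

Because the enumerator is lazy, indices after the first duplicate are never examined, and no mutable variable or explicit break is needed.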
Early breakout always suggests recursion.
def isUnique(array: Array[Char]): Option[Char] = {
def getDup(index: Int, acc: Set[Char]): Option[Char] =
if (array.isDefinedAt(index))
if (acc(array(index))) Some(array(index))
else getDup(index+1, acc + array(index))
else None
getDup(0, Set.empty[Char])
}
Usage:
val word = "abcdefggghijklmnopqrstuvqxyz".toArray
val charThatBrokeIt = isUnique(word)
//charThatBrokeIt: Option[Char] = Some(g)

storing the longest string after strsplit

I am trying to store the longest resulting string after using the function strsplit, but I am unable to do so.
E.g. I have input strings such as
'R.DQDEGNFRRFPTNAVSMSADENSPFDLSNEDGAVYQRD.L' or
'L.TSNKDEEQRELLKAISNLLD'
I need to store only the string between the dots (.).
If there is no dot, then I want the entire string.
Each string may have zero, one or two dots.
Part of the code which I am using:
for i=1:700
x=regexprep(txt(i,1), '\([^\(\)]*\)','');
y=(strsplit(char(x),'.'));
for j=1:3
yValues(1,j)=y{1,j};
end
end
But the string yValues is not storing the value of y, instead showing the following error:
Assignment has more non-singleton rhs dimensions than non-singleton subscripts
What am I doing wrong and are there any suggestions on how to fix it?
The issue is that y is a cell array and each element contains an entire string, so it can't be assigned to a single element of a normal array, yValues(1,j).
You need yValues to be a cell array, and then you can assign into it just fine.
yValues{j} = y{j};
Or more simply
% Outside of your loop
yValues = cell(1,3);
% Then inside of your loop
yValues(j) = y(j);
Alternately, if you just want the longest output of strsplit, you can just do something like this.
% Split the string
parts = strsplit(mystring, '.');
% Find the length of each piece and figure out which piece was the longest
[~, ind] = max(cellfun(@numel, parts));
% Grab just the longest part
longest = parts{ind};
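The split-then-pick-the-longest idea looks like this in Ruby, shown only as a cross-language illustration of the MATLAB snippet above. The name longest_part is hypothetical, and it assumes the wanted piece is always the longest one.

```ruby
# Hypothetical helper: split on dots and keep the longest piece.
# A string without dots comes back whole; an empty string gives nil.
def longest_part(str)
  str.split(".").max_by(&:length)
end

p longest_part("R.DQDEGNFRRFPTNAVSMSADENSPFDLSNEDGAVYQRD.L")
p longest_part("L.TSNKDEEQRELLKAISNLLD")
```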

Find longest suffix of string in given array

Given a string and an array of strings, find the longest string in the array that is a suffix of the given string.
For example:
string = google.com.tr
array = tr, nic.tr, gov.nic.tr, org.tr, com.tr
returns com.tr
I have tried to use binary search with specific comparator, but failed.
C-code would be welcome.
Edit:
I should have said that I'm looking for a solution where I can do as much work as possible in a preparation step (when I only have the array of suffixes and can sort it in any way, build any data structure around it, etc.), and then, for a given string, find its suffix in this array as fast as possible. I also know that I can build a trie out of this array, and that this will probably give me the best possible performance, BUT I'm very lazy, and maintaining a trie in raw C in a huge piece of tangled enterprise code is no fun at all. So some binsearch-like approach would be very welcome.
Assuming constant time addressing of characters within strings this problem is isomorphic to finding the largest prefix.
1. Let i = 0.
2. Let S = null.
3. Let c = prefix[i].
4. Remove strings a from A if a[i] != c, and stop if A is now empty. Replace S with a if a.Length == i + 1.
5. Increment i.
6. Go to step 3.
Is that what you're looking for?
Example:
prefix = rt.moc.elgoog
array = rt.moc, rt.org, rt.cin.vog, rt.cin, rt
Pass 0: prefix[0] is 'r' and array[j][0] == 'r' for all j so nothing is removed from the array. i + 1 -> 0 + 1 -> 1 is our target length, but none of the strings have a length of 1, so S remains null.
Pass 1: prefix[1] is 't' and array[j][1] == 't' for all j so nothing is removed from the array. However there is a string that has length 2, so S becomes rt.
Pass 2: prefix[2] is '.' and array[j][2] == '.' for the remaining strings so nothing changes.
Pass 3: prefix[3] is 'm' and array[j][3] != 'm' for rt.org, rt.cin.vog, and rt.cin so those strings are removed.
etc.
Another naïve, pseudo-answer.
Set boolean "found" to false. While "found" is false, iterate over the array comparing the source string to the strings in the array. If there's a match, set "found" to true and break. If there's no match, use something like strchr() to get to the segment of the string following the first period. Iterate over the array again. Continue until there's a match, or until the last segment of the source string has been compared to all the strings in the array and failed to match.
Not very efficient....
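The segment-by-segment idea above can be sketched in Ruby as follows. As an assumption beyond the answer, the suffix array is loaded into a Set once (the preparation step), so each candidate is a hash lookup rather than a scan of the array. The name longest_suffix is hypothetical.

```ruby
require "set"

# Hypothetical helper: try the whole string, then the part after the first
# period, and so on, returning the first (and therefore longest) candidate
# that is in the suffix list, or nil if none is.
def longest_suffix(host, suffixes)
  known = suffixes.to_set # preparation step, could be done once up front
  candidate = host
  while candidate
    return candidate if known.include?(candidate)
    # Drop everything up to and including the first period.
    _, dot, rest = candidate.partition(".")
    candidate = dot.empty? ? nil : rest
  end
  nil
end

suffixes = %w[tr nic.tr gov.nic.tr org.tr com.tr]
p longest_suffix("google.com.tr", suffixes)
```

The number of lookups is bounded by the number of periods in the input string, independent of how many suffixes are stored.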
Naive, pseudo-answer:
1. Sort the array of suffixes by length, longest first (yes, there may be strings of the same length, which I think is a problem with the question you are asking).
2. Iterate over the array and check whether each suffix is at the end of the given string.
3. If it is, exit the loop because you are done! If not, continue.
Alternatively, you could skip the sorting and just iterate, assigning the current string to biggestString whenever it matches and is longer than the biggestString matched so far.
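The unsorted variant can be sketched in a few lines of Ruby. The name longest_matching_suffix is hypothetical. Note that String#end_with? compares characters only, so "xtr" would also match the suffix "tr"; a real domain check would need to respect the period boundaries.

```ruby
# Hypothetical helper: keep every array entry that the string ends with,
# then pick the longest one (nil when nothing matches).
def longest_matching_suffix(string, suffixes)
  suffixes.select { |suffix| string.end_with?(suffix) }.max_by(&:length)
end

p longest_matching_suffix("google.com.tr", %w[tr nic.tr gov.nic.tr org.tr com.tr])
```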
Edit 0:
Maybe you could improve this by looking at your array before hand and considering "minimal" elements that need to be checked.
For instance, if .com appears in 20 members you could just check .com against the given string to potentially eliminate 20 candidates.
Edit 1:
On second thought, in order to compare elements in the array you will need to use a string comparison. My feeling is that any gain you get out of an attempt at optimizing the list of strings for comparison might be negated by the expense of comparing them before doing so, if that makes sense. Would appreciate if a CS type could correct me here...
If your array of strings is something along the following lines:
#include <string.h>

char string[STRINGS][MAX_STRING_LENGTH];
strcpy(string[0], "google.com.tr");
strcpy(string[1], "nic.tr");
etc., then you can simply do this:
/* Find the index of the longest string. */
int x, longest = 0;
for (x = 1; x < STRINGS; x++) {
    if (strlen(string[x]) > strlen(string[longest])) {
        longest = x;
    }
}
/* Advance x to the first '.' in that string, if there is one. */
x = 0;
while (string[longest][x] != '\0' && string[longest][x] != '.') {
    x++;
}
/* Copy everything after the dot (an empty string if there is no dot). */
char output[MAX_STRING_LENGTH];
if (string[longest][x] == '.') {
    x++;
}
strcpy(output, &string[longest][x]);
(The above code may not actually work (errors, etc.), but you should get the general idea.)
Why don't you use suffix arrays? They work well when you have a large number of suffixes.
The complexity is O(n (log n)^2); there are O(n log n) versions too.
There is an implementation in C here. You can also try googling suffix arrays.
