Basic Pig Latin Translator returning a NoMethodError - arrays

What I've got here gives me the solution I'm looking for but I get a NoMethodError for the slice method when I run it.
def pig_it(text)
  b4pig = text.split(" ")
  l1 = b4pig[0].slice(0,1)
  l2 = b4pig[1].slice(0,1)
  l3 = b4pig[2].slice(0,1)
  l4 = b4pig[3].slice(0,1)
  done = b4pig[0].delete(l1)+l1+"ay " + b4pig[1].delete(l2)+l2+"ay " + b4pig[2].delete(l3)+l3+"ay " + b4pig[3].delete(l4)+l4+"ay"
  return done
end
All the program needs to do is convert the first phrase to the second phrase
('Pig latin is cool'),'igPay atinlay siay oolcay')
('This is my string'),'hisTay siay ymay tringsay')

I suggested one possible reason for the exception in a comment on the question, but you have not given enough information for readers to provide a definitive explanation.
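One plausible cause, stated here as an assumption because the failing call is not shown: the method hard-codes four words, so any input with fewer than four words leaves one of the b4pig[n] lookups as nil, and nil has no slice method. The sketch below reproduces that and then avoids it by mapping over however many words there are. The names pig_word and pig_it_any_length are hypothetical, not from the question.

```ruby
# Reproduce the likely failure: indexing past the end of an array gives nil.
words = "Pig latin is".split(" ")
begin
  words[3].slice(0, 1)          # words[3] is nil for a three-word input
rescue NoMethodError => e
  error_class = e.class         # the exception reported in the question
end

# Hypothetical helper: move the first letter to the end and append "ay".
def pig_word(word)
  word[1..-1] + word[0] + "ay"
end

# Works for any number of space-separated words.
def pig_it_any_length(text)
  text.split(" ").map { |w| pig_word(w) }.join(" ")
end
```

Slicing with word[1..-1] also sidesteps a second problem in the original: String#delete removes every occurrence of the letter, so "test".delete("t") gives "es" rather than "est".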
I would like to suggest an alternative construction of your method that has three advantages:
the argument, a string, may contain any number of words;
punctuation is permitted; and
whitespace is preserved (e.g., tabs, line terminators, extra spaces).
The method is as follows.
def pig_it(text)
  text.gsub(/[a-z]+/i) { |word| format("%s%say", word[1..], word[0]) }
end
Let's try it.
pig_it "Baa, baa black sheep\nHave you any wool?"
  #=> "aaBay, aabay lackbay heepsay\naveHay ouyay nyaay oolway?"
See String#gsub and Kernel#format.
/[a-z]+/i is a regular expression. [a-z] is a character class; it matches exactly one character from the class, which contains all lower-case letters of the alphabet. [a-z]+ matches one or more lower-case letters. The i following / makes the expression case-insensitive, so it matches one or more letters, each being lower or upper case. It follows that whitespace and punctuation are not matched.
The block ({ |word| .... }) contains the block variable word, which holds each match of the regular expression. In the example, word will in turn hold "Baa", "baa", "black" and so on.
Suppose word holds "black", then
word[0] #=> "b"
word[1..] #=> "lack"
format("%s%say", word[1..], word[0])
  #=> format("%s%say", "lack", "b")
  #=> "lackbay"
I kept your method name because I do not think it can be improved upon.

Related

How do I select 3 characters in a string after a specific symbol

OK, so let me explain. I have strings like "BAHDGF - ZZZGH1237484" or "HDG54 - ZZZ1HDGET4", and I want to select the triple Z (in reality these are three different characters, but I think the example is easier to follow this way).
The first part has a variable length, but I can simply ignore it. I need something that takes the triple Z, so I was thinking about something that can "slice" my string after the " - ".
I tried using "partition" but failed miserably. I got lost with the resulting three-element array and with taking the first three letters of one of its elements; it seems very complicated, and I think I'm missing an obvious solution. I've been stuck on this for about two days, a sort of blank-page syndrome, and I really need a little help to get past this point.
Given:
examples=[ "BAHDGF - ZZZGH1237484", "HDG54 - ZZZ1HDGET4" ]
You could use a regex:
examples.each {|e| p e, e[/(?<=-\s)ZZZ/]}
Prints:
"BAHDGF - ZZZGH1237484"
"ZZZ"
"HDG54 - ZZZ1HDGET4"
"ZZZ"
Or .split with a regex:
examples.each {|e| p e.split(/-\s*(ZZZ)/)[1] }
Prints:
"ZZZ"
"ZZZ"
If the 3 characters are something other than 'ZZZ' just modify your regex:
> "BAHDGF - ABCGH1237484".split(/\s*-\s*([A-Z]{3})/)[1]
=> "ABC"
If you wanted to use .partition it is two steps. Easiest with a regex partition and then just take the first three characters:
> "BAHDGF - ABCGH1237484".partition(/\s*-\s*/)[2][0..2]
=> "ABC"
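The partition approach can be wrapped in a small method that also handles strings without the separator. This is a minimal sketch; chars_after_dash and its count parameter are hypothetical names, and it assumes the first hyphen in the string is the separator.

```ruby
# Hypothetical helper: return the first `count` characters after the
# dash separator, or nil when the string contains no dash at all.
def chars_after_dash(str, count = 3)
  _before, sep, after = str.partition(/\s*-\s*/)
  return nil if sep.empty?      # partition found no separator
  after[0, count]
end

chars_after_dash("BAHDGF - ZZZGH1237484")  # "ZZZ"
chars_after_dash("HDG54 - ABC1HDGET4")     # "ABC"
```

String#partition always returns three elements; when nothing matches, the second and third are empty strings, which is what the sep.empty? check relies on.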
"Selecting" the string "ZZZ" is a misnomer. What you have asked for is to determine if the string contains the substring "- ZZZ" and if it does, return "ZZZ":
"BAHDGF - ZZZGH1237484".include?("- ZZZ") && "ZZZ"
#=> "ZZZ"
"BAHDGF - ZZVGH1237484".include?("- ZZZ") && "ZZZ"
#=> false
That is very little different than just asking if the string contains the substring "- ZZZ":
if "BAHDGF - ZZZGH1237484".include?("- ZZZ")
...
end
If the question were instead, say, return a string of three identical capital letters following "- ", if present, you would be selecting a substring. That could be done as follows.
r = /- \K(\p{Lu})\1{2}/
"BAHDGF - XXXGH1237484"[r]
#=> "XXX"
"BAHDGF - xxxGH1237484"[r]
#=> nil
The regular expression reads, "match '- ', then forget everything matched so far and reset the match pointer to the current location (\K), then match an upper case Unicode letter (\p{Lu}) and save it to capture group 1 ((\p{Lu})), then match the contents of capture group 1 (\1) twice ({2})". One may alternatively use a positive lookbehind:
/(?<=- )(\p{Lu})\1{2}/

How do I specify that a match should occur if a character does not belong to an array?

I'm having trouble specifying "the next character should not be from this group of characters" in my regex. I have
TOKENS = [":", ".", "'"]
"01:39\t" =~ /\b0\d[#{Regexp.union(TOKENS)}]\d\d^#{Regexp.union(TOKENS)}/
#=> nil
Since "\t" is not part of my TOKENS array, I would think the above should match, but it does not. How do I adjust my regex, specifically this part
^#{Regexp.union(TOKENS)}
to say that the character should not be part of this array?
You need brackets around the "not" portion of the regex.
>> TOKENS = [":", ".", "'"]
>> regex = /\b0\d[#{Regexp.union(TOKENS)}]\d\d^#{Regexp.union(TOKENS)}/
>> "01:39\t" =~ regex
#=> nil
However:
>> regex = /\b0\d[#{Regexp.union(TOKENS)}]\d\d[^#{Regexp.union(TOKENS)}]/
#                            Add brackets here^                and here^
>> "01:39\t" =~ regex
#=> 0
Your /\b0\d[#{Regexp.union(TOKENS)}]\d\d^#{Regexp.union(TOKENS)}/ pattern will finally look like
/(?-mix:\b0\d[(?-mix::|\.|')]\d\d^(?-mix::|\.|'))/
             ^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^
Here, each interpolated Regexp appears as a modifier group, (?-mix:...), with the multiline, case-insensitive and free-spacing modes disabled. The last ^ is a start-of-line anchor; placed right after \d\d, it can never be satisfied, so it alone turns the whole regex into a pattern that never matches any string.
It is not enough to wrap the #{Regexp.union(TOKENS)} with [...] character class brackets, you would need to use the .source property to get rid of (?-mix:...) since you do not want to negate m, i, x, etc. However, you just can't use Regexp.union since it will add | char and inside a character class, it is treated as a literal char (so, you will also negate pipes).
You should define the separator sequence with TOKENS.join().gsub(/[\]\[\^\\-]/, '\\\\\\&') to escape all chars that should be escaped inside a regex character class and then place in between character class square brackets.
Ruby demo:
TOKENS = [":", ".", "'", "]"]
sep_rx = TOKENS.join().gsub(/[\]\[\^\\-]/, '\\\\\\&')
puts sep_rx
# => :.'\]
rx = /\b0\d[#{sep_rx}]\d\d[^#{sep_rx}]/
puts rx.source
# => \b0\d[:.'\]]\d\d[^:.'\]]
puts "01:39\t" =~ rx
# => 0
See the Rubular demo
Note that .gsub(/[\]\[\^\\-]/, '\\\\\\&') matches ], [, ^, \ and - and adds a backslash in front of each. In '\\\\\\&', the first 4 backslashes define a literal backslash in the replacement pattern, and \\& stands for the whole match.
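As an alternative to the hand-written gsub escaping, the built-in Regexp.escape can be applied to each token before joining. This is a sketch of a swapped-in technique, not the answer's method. It assumes every token is a single character, and the names token_chars, escaped and time_rx are illustrative.

```ruby
# Assumption: each token is exactly one character.
token_chars = [":", ".", "'", "]", "-"]

# Regexp.escape backslash-escapes regex metacharacters such as ], - and .
# which keeps them literal inside a character class as well.
escaped = token_chars.map { |t| Regexp.escape(t) }.join
time_rx = /\b0\d[#{escaped}]\d\d[^#{escaped}]/

tab_result = time_rx =~ "01:39\t"   # matches: "\t" is not a token
dot_result = time_rx =~ "01:39."    # no match: "." is a token
```

Regexp.escape escapes more characters than a character class strictly needs, but an escaped literal inside brackets still matches only itself, so the extra backslashes are harmless.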

Using variables with %w or %W - Ruby

I have a similar issue as this post:
How to use variable inside %w{}
but my issue is a bit different. I want to take a string variable and convert it to an array using %w or %W.
text = gets.chomp # get user text string
#e.g I enter "first in first out"
words = %w[#{text}] # convert text into array of strings
puts words.length
puts words
Console output
1
first in first out
Keeps the text as a block of string and doesn't split it into an array words ["first","in", "first", "out"]
words = text.split (" ") # This works fine
words = %w[#{gets.chomp}] # This doesn't work either
words = %w['#{gets.chomp}'] # This doesn't work either
words = %W["#{gets.chomp}"] # This doesn't work either
words = %w("#{gets.chomp}") # This doesn't work either
%w does not split anything at run time; it is literal syntax that tells the parser the text that follows in the source is a whitespace-separated list of words. In essence it's just a shorthand notation.
In the case of %W the #{...} chunks are treated as a single token, any spaces contained within are considered an integral part.
The correct thing to do is this:
words = text.strip.split(/\s+/)
Doing things like %W[#{...}] is just as pointless as "#{...}". If you need something cast as a string, call .to_s. If you need something split call split.
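The difference can be seen side by side in a short sketch (the variable names are illustrative):

```ruby
text = "first in  first out"

# Split by the parser, from the words written in the source file.
literal = %w[first in first out]

# The interpolated value is one token, so the array has ONE element.
interp = %W[#{text}]

# Split at run time; with no argument, split breaks on runs of whitespace.
words = text.split
```

Here interp.length is 1 while words.length is 4, which matches the console output shown in the question.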

Perl: custom sort order?

A stenographic keyboard has the keys in a specific order: STKPWHRAO*#EUFRPBLGTS.
I am attempting to take an input $word and determine if its letters follow this order, from left to right.
So KAT would be valid, but FRAG would not be, because while F is before R on the right side, they are not before A-. TKPWAUL would work, but GAUL would not, because -G is not before A. The keys must be ordered from left to right.
I'm getting tripped up by some letters appearing twice in the order.
Thank you very much for any ideas!
You could create a regex with anchors to start and end of string and allow every character 0 or one time. Here's an example:
sub match {
my $yesno = $_[0] =~ /^S?T?K?P?W?H?R?A?O?\*?#?E?U?F?R?P?B?L?G?T?S?\.?$/g;
print $_[0] . " " . ($yesno ? 'yes' : 'no') . "\n";
}
match 'KAT';
match 'FRAG';
match 'TKPWAUL';
match 'GAUL';
delivers
KAT yes
FRAG no
TKPWAUL yes
GAUL no
You could generate that regex from a list using split, join etc.
Here is a straightforward algorithm. This should be efficient, and can also be improved if needed.
Iterate through the characters in the word, searching for each in the reference sequence. Compare match's position in the sequence with the one for the previous character. Keep searching for all matches since some letters repeat in the sequence. Search uses index.
my $refseq = 'STKPWHRAO*#EUFRPBLGTS';   # key order from the question
sub accept_word {
    my ($refseq, $word) = @_;
    my ($mark, $pos) = (0, 0);
    foreach my $ch (split '', $word) {
        # search until position is >= $mark, or the word is bad
        while ( ($pos = index($refseq, $ch, $pos)) != -1 ) {
            $mark = $pos, last if $pos >= $mark;
        }
        return 0 if $pos < $mark;
    }
    return 1;
}
for my $word (qw(KAT FRAG TKPWAUL GAUL SAS)) {
    print "$word is " . (accept_word($refseq, $word) ? 'accepted' : 'rejected') . "\n";
}
Comments:
This can be tightened quite a bit if needed. The search can be greatly optimized since only a few letters repeat in the sequence (S, T, P and R each appear twice; see comments). Or, it can be optimized by looking up the count of letters in the sequence first (say via ('S' => 2, 'T' => 2, 'K' => 1) etc.) so that index doesn't do unnecessary work. See tba's comment for his link to a slightly tighter version and a benchmark between that and his posted regex solution, which uses a different algorithm.
Here is a detailed description of this step-by-step solution. Iterate through your word by characters, for each doing the following:
Traverse the reference sequence and once a match is found record its numeric index in the sequence. On the first pass (first word character) this becomes the highest position, say $mark.
For the remaining iterations one needs to be careful, since the reference sequence has repeated characters. (Thanks to tba for a comment.) When a character is found in the sequence, the index of the match is compared to $mark; if it is >= $mark, we reset $mark to it and go to the next character. If the position is < $mark, the search and comparison continue until a position >= $mark is found or the sequence is exhausted, in which case the word is discarded (the character is to the left of the previous one). Improvement: start the search from $mark; if a match is found, reset $mark and move to the next character, otherwise discard the word (done via index in the code above). As you match the characters in your word, you crawl up the reference sequence, remembering how far you got.
This way the word is mapped to a non-decreasing numeric sequence based on the reference string, or discarded. In the code above that numeric encoding can be recorded if needed.
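For readers following the Ruby snippets elsewhere on this page, the same left-to-right scan can be sketched in Ruby. This is not the answer's Perl code. The names STENO_ORDER and accept_word? are hypothetical, and unlike the Perl version this sketch resumes the search one position past each match, so a key is never used twice for the same word.

```ruby
# Key order taken from the question.
STENO_ORDER = "STKPWHRAO*#EUFRPBLGTS"

# Returns true when the letters of `word` appear in `refseq` in order,
# always taking the leftmost key that is still available.
def accept_word?(word, refseq = STENO_ORDER)
  pos = 0
  word.each_char do |ch|
    found = refseq.index(ch, pos)   # search from the current position on
    return false if found.nil?      # no such key remains to the right
    pos = found + 1                 # the next letter must come after it
  end
  true
end

%w[KAT FRAG TKPWAUL GAUL SAS].each do |w|
  puts "#{w} is #{accept_word?(w) ? 'accepted' : 'rejected'}"
end
```

Taking the leftmost available key is sufficient here: if a word fits the order at all, it also fits when each letter uses the earliest key it can.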

Taking user input and confirming format

I am new to Ruby: I have only completed the codecademy for it and I have very limited experience with Rails.
I am trying to create a simple function that will take a user's input and see if it meets the right criteria for a serial number: three capital letters, a dash, and then seven numbers.
Here is what I have so far:
"enter serial"
serialNumber = gets.chomp
serNumarr = serialNumber.split("")
caps = serNumarr[0..2]
dash = serNumarr[3]
nums = serNumarr[4..10]
if dash != "-"
puts "not a serial Number"
end
Now I have also asked on other forums and I was told to utilize this code:
def letter?(lookAhead)
lookAhead =~ /[[:alpha:]]/
end
But I have zero experience with regular expressions. How can I use the above code to solve my problem? Thanks.
Here's a contrived solution:
puts "enter three capital letters, a dash, and then seven numbers:"
input = gets.chomp #=> note that input is a String
if input =~ /^[A-Z]{3}-\d{7}$/
puts "valid"
else
puts "invalid"
end
Breaking down the regular expression into human-readable language:
^ means start of line (unless it is in a character class, where it means negation)
[A-Z]{3} means 3 of any uppercase letter; [] represents a character class (here, uppercase letters) and the following {3} means exactly three characters from that class
- is the dash character
\d{7} means exactly 7 digits
$ means end of line
If you don't have experience with regular expressions, it'll be worth your while to find a tutorial and invest the time to learn the basics. And http://rubular.com/ is an online regular expression editor that I can't endorse strongly enough.
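One refinement worth considering: in Ruby, ^ and $ anchor to lines, so a multi-line string with a valid serial on its first line would pass the check above. The sketch below uses \A and \z, which anchor to the whole string. SERIAL_RX and valid_serial? are hypothetical names.

```ruby
# \A and \z anchor to the start and end of the whole string,
# whereas ^ and $ anchor to the start and end of each line.
SERIAL_RX = /\A[A-Z]{3}-\d{7}\z/

def valid_serial?(input)
  SERIAL_RX.match?(input)
end

puts valid_serial?("ABC-1234567") ? "valid" : "invalid"
puts valid_serial?("abc-1234567") ? "valid" : "invalid"
```

With input read via gets.chomp the two forms usually behave the same, since chomp removes the trailing newline; the stricter anchors matter when the string can come from elsewhere.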
