So I have managed to download the content of this webpage - http://www.puzzlers.org/pub/wordlists/unixdict.txt
and I am trying to sort each word into an array like so -
let url = NSURL(string: "http://www.puzzlers.org/pub/wordlists/unixdict.txt")
let task = NSURLSession.sharedSession().dataTaskWithURL(url!) {
(data, response, error) in
if error == nil {
var urlContent = NSString(data: data, encoding: NSUTF8StringEncoding) as String
var wordArr = urlContent.componentsSeparatedByString(" ")
println(wordArr)
}
}
task.resume()
But the problem is that I am not getting each word into the array, because the words are not separated by a space. All I'm getting is the whole content as one item in the array.
How would I go about separating the content into an array of words, splitting on line breaks?
Thank you in advance :)
The preferred way is enumerateLinesUsingBlock with NSString:
var urlContent = NSString(data: data, encoding: NSUTF8StringEncoding)!
var lines:[String] = []
urlContent.enumerateLinesUsingBlock { line, _ in
lines.append(line)
}
println(lines)
In this specific case, you can simply split by "\n":
var urlContent = NSString(data: data, encoding: NSUTF8StringEncoding) as String
var lines = urlContent.componentsSeparatedByString("\n")
if lines.last == "" {
// The last one is empty string
lines.removeLast()
}
But in general, the line separator may not be "\n". See the documentation:
A line is delimited by any of these characters, the longest possible
sequence being preferred to any shorter:
U+000D (\r or CR)
U+2028 (Unicode line separator)
U+000A (\n or LF)
U+2029 (Unicode paragraph separator)
\r\n, in that order (also known as CRLF)
enumerateLinesUsingBlock can handle all of them.
I have a text file consisting of repeating string elements like this:
#SRR1582908.1 ILLUMINA_0154:8:1101:3556:1998/1 //1 line
CCTCGCTGGTCGATTTGTTTAACCGTTTTCTGTTCAGCGCCAAAATTATTTT //2 line
+ //3 line
BCCFFFFFHHHHHIIIIIHIHGHHGGIIIIIGHHDBGHHHIIGHIIGIHIII //4 line
and there are 29 million such elements in the file. What I need is to extract the second line of each element into a new array of strings.
I have a function which does the following:
do
{
let textToOpen = try String(contentsOf: chosenFile!, encoding: .utf8)
var arrayOfTextLines = textToOpen.components(separatedBy: "\n")
arrayOfReads = stride(from: 1, to: arrayOfTextLines.endIndex, by: 4).map{(arrayOfTextLines[$0])}
}
catch
{}
The code works correctly, but for a file with 29 million elements it is relatively slow.
The main bottleneck is
var arrayOfTextLines = textToOpen.components(separatedBy: "\n")
Is there a way to speed up the function?
I've read a text file into a big string:
fileText = try NSString(contentsOfFile: pathToFile, encoding: String.Encoding.utf8.rawValue) as String
(I omitted the do/catch part and fileText was declared as an optional string constant before the assignment).
Now I split the lines out into an array of strings, trim the whitespace from each, and then remove any empty strings:
let lines = (fileText!.components(separatedBy: "\n")).map { $0.trimmingCharacters(in: .whitespaces)}.filter {$0.count > 0}
And it works fine, but I'm learning Swift 4 and I suspect there's a cleaner way to accomplish my task, right? I would appreciate any examples that put my code to shame. Thanks!
The problem with your code is that you iterate through all the String characters four times, but the task can be completed by iterating only once. Something like this:
let myString = String() // String received from any source
var lines = [String]()
var line = ""
myString.forEach {
switch $0 {
case " ":
break
case "\n":
if line.count > 0 {
lines.append(line)
line = ""
}
default:
line += String($0)
}
}
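// Flush the final line if the input does not end with "\n"
if line.count > 0 {
lines.append(line)
}
print(lines)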
For test file:
$ cat test.txt
123123123 12312312
123 12312312 12312312
sfsdfsdfsfsdf
sdfsdf 23234 sdfsdfs 23234
sdfsdf
sdfsdfsdf
Result will be:
$ swift main.swift
["12312312312312312", "1231231231212312312", "sfsdfsdfsfsdf", "sdfsdf23234sdfsdfs23234", "sdfsdf", "sdfsdfsdf"]
From the documentation, it's not clear. In Java you could use the split method like so:
"some string 123 ffd".split("123");
Use split()
let mut split = "some string 123 ffd".split("123");
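This gives an iterator, which you can either loop over or collect() into a vector. The snippets below are alternatives: the for loop consumes the iterator, so it cannot be collected afterwards.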
for s in split {
println!("{}", s)
}
let vec = split.collect::<Vec<&str>>();
// OR
let vec: Vec<&str> = split.collect();
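A complete program along those lines, as a minimal sketch: split_to_vec is a hypothetical helper name, and a fresh iterator is created for each use.

```rust
/// Hypothetical helper: splits `text` on `sep` and collects the pieces.
fn split_to_vec<'a>(text: &'a str, sep: &str) -> Vec<&'a str> {
    text.split(sep).collect()
}

fn main() {
    let text = "some string 123 ffd";

    // Loop over one iterator...
    for piece in text.split("123") {
        println!("{}", piece);
    }

    // ...and build a new one to collect into a Vec.
    let vec = split_to_vec(text, "123");
    println!("{:?}", vec);
}
```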
There are several simple ways:
By separator:
s.split("separator") | s.split('/') | s.split(char::is_numeric)
By whitespace:
s.split_whitespace()
By newlines:
s.lines()
By regex: (using regex crate)
Regex::new(r"\s").unwrap().split("one two three")
The result of each kind is an iterator:
let text = "foo\r\nbar\n\nbaz\n";
let mut lines = text.lines();
assert_eq!(Some("foo"), lines.next());
assert_eq!(Some("bar"), lines.next());
assert_eq!(Some(""), lines.next());
assert_eq!(Some("baz"), lines.next());
assert_eq!(None, lines.next());
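The separator, whitespace and newline variants can be compared side by side. This is a small sketch that uses only the standard library (the regex variant is left out because it needs an external crate); the function names and sample inputs are made up for illustration.

```rust
// Split on a single character separator.
fn by_separator(s: &str) -> Vec<&str> {
    s.split('/').collect()
}

// Split on runs of whitespace; leading and trailing whitespace is ignored.
fn by_whitespace(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

// Split on line endings ("\n" or "\r\n").
fn by_lines(s: &str) -> Vec<&str> {
    s.lines().collect()
}

fn main() {
    println!("{:?}", by_separator("usr/local/bin"));
    println!("{:?}", by_whitespace(" one  two\tthree\n"));
    println!("{:?}", by_lines("foo\r\nbar\n\nbaz\n"));
}
```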
There is a split method on str (also available on String through deref):
fn split<'a, P>(&'a self, pat: P) -> Split<'a, P> where P: Pattern<'a>
Split by char:
let v: Vec<&str> = "Mary had a little lamb".split(' ').collect();
assert_eq!(v, ["Mary", "had", "a", "little", "lamb"]);
Split by string:
let v: Vec<&str> = "lion::tiger::leopard".split("::").collect();
assert_eq!(v, ["lion", "tiger", "leopard"]);
Split by closure:
let v: Vec<&str> = "abc1def2ghi".split(|c: char| c.is_numeric()).collect();
assert_eq!(v, ["abc", "def", "ghi"]);
split returns an Iterator, which you can convert into a Vec using collect: split_line.collect::<Vec<_>>(). Going through an iterator instead of returning a Vec directly has several advantages:
split is lazy. This means that it won't really split the line until you need it. That way it won't waste time splitting the whole string if you only need the first few values: split_line.take(2).collect::<Vec<_>>(), or even if you need only the first value that can be converted to an integer: split_line.filter_map(|x| x.parse::<i32>().ok()).next(). This last example won't waste time attempting to process the "23.0" but will stop processing immediately once it finds the "1".
split makes no assumption on the way you want to store the result. You can use a Vec, but you can also use anything that implements FromIterator<&str>, for example a LinkedList or a VecDeque, or any custom type that implements FromIterator<&str>.
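Both points can be tried out in a short sketch. The input line "foo 1 23.0" is an assumed sample (the original input is not shown here), and the helper names are hypothetical.

```rust
use std::collections::VecDeque;

// Stops at the first piece that parses as an i32; later pieces are never examined.
fn first_int(line: &str) -> Option<i32> {
    line.split(' ').filter_map(|x| x.parse::<i32>().ok()).next()
}

// Only the first two pieces are produced by the iterator.
fn first_two(line: &str) -> Vec<&str> {
    line.split(' ').take(2).collect()
}

// The same iterator collected into a different container.
fn as_deque(line: &str) -> VecDeque<&str> {
    line.split(' ').collect()
}

fn main() {
    let line = "foo 1 23.0"; // assumed sample input
    println!("{:?}", first_int(line));
    println!("{:?}", first_two(line));
    println!("{:?}", as_deque(line));
}
```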
There's also split_whitespace()
fn main() {
let words: Vec<&str> = " foo bar\t\nbaz ".split_whitespace().collect();
println!("{:?}", words);
// ["foo", "bar", "baz"]
}
The OP's question was how to split on a multi-character string. Here is a way to get the results, part1 and part2, as Strings instead of in a vector.
Here the string is split on the non-ASCII string "ββπ€" in place of "123":
let s = "ββπ€"; // also works with non-ASCII characters
let mut part1 = "some string ββπ€ ffd".to_string();
let _t;
let part2;
if let Some(idx) = part1.find(s) {
part2 = part1.split_off(idx + s.len());
_t = part1.split_off(idx);
}
else {
part2 = "".to_string();
}
gets: part1 = "some string "
      part2 = " ffd"
If "ββπ€" is not found, part1 contains the untouched original String and part2 is empty.
Rosetta Code has a nice example, "Split a character string based on change of character", that can be solved with a short solution using split_off:
fn main() {
let mut part1 = "gHHH5YY++///\\".to_string();
if let Some(mut last) = part1.chars().next() {
let mut pos = 0;
while let Some(c) = part1.chars().find(|&c| {if c != last {true} else {pos += c.len_utf8(); false}}) {
let part2 = part1.split_off(pos);
print!("{}, ", part1);
part1 = part2;
last = c;
pos = 0;
}
}
println!("{}", part1);
}
The task it solves:
Split a (character) string into comma (plus a blank) delimited strings based on a change of character (left to right).
If you are looking for the Python-flavoured split where you tuple-unpack the two ends of the split string, you can do:
if let Some((a, b)) = line.split_once(' ') {
// ...
}
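split_once also accepts a multi-character separator, so it can return both halves without building a vector. A minimal sketch, assuming a reasonably recent stable Rust toolchain; split_pair is a hypothetical helper that falls back to the whole input and an empty string when the separator is missing.

```rust
/// Hypothetical helper: returns the text before and after the first `sep`,
/// or the whole input and an empty string when `sep` does not occur.
fn split_pair<'a>(line: &'a str, sep: &str) -> (&'a str, &'a str) {
    line.split_once(sep).unwrap_or((line, ""))
}

fn main() {
    let (a, b) = split_pair("some string 123 ffd", "123");
    println!("{:?} {:?}", a, b);

    // Only the first occurrence of the separator is used.
    let (key, rest) = split_pair("key value rest", " ");
    println!("{:?} {:?}", key, rest);
}
```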
I am having trouble figuring this out. As the title says: how do I delete an element that contains a letter from an array? This is the code I have so far.
let newline = "\n"
let task = Process()
task.launchPath = "/bin/sh"
task.arguments = ["-c", "traceroute -nm 18 -q 1 8.8.8.8"]
let pipe = Pipe()
task.standardOutput = pipe
task.launch()
let data = pipe.fileHandleForReading.readDataToEndOfFile()
let output = NSString(data: data, encoding: String.Encoding.utf8.rawValue) as! String
var array = output.components(separatedBy: " ")
array = array.filter(){$0 != "m"}
print(array, newline)
I have tried multiple options from this Stack Overflow question:
How to remove an element from an array in Swift
I think I have hit a wall.
Have you tried this?
array = array.filter({ !$0.contains("m") })
I read into myArray (native Swift) from a file containing a few thousand lines of plain text.
myData = String.stringWithContentsOfFile(myPath, encoding: NSUTF8StringEncoding, error: nil)
var myArray = myData.componentsSeparatedByString("\n")
I change some of the text in myArray (no point pasting any of this code).
Now I want to write the updated contents of myArray to a new file.
I've tried this:
let myArray2 = myArray as NSArray
myArray2.writeToFile(myPath, atomically: false)
but the file content is then in the plist format.
Is there any way to write an array of text strings to a file (or loop through an array and append each array item to a file) in Swift (or bridged Swift)?
As drewag points out in the accepted post, you can build a string from the array and then use the writeToFile method on the string.
However, you can simply use Swift's Array.joinWithSeparator to accomplish the same with less code and likely better performance.
For example:
// swift 2.0
let array = [ "hello", "goodbye" ]
let joined = array.joinWithSeparator("\n")
do {
try joined.writeToFile(saveToPath, atomically: true, encoding: NSUTF8StringEncoding)
} catch {
// handle error
}
// swift 1.x
let array = [ "hello", "goodbye" ]
let joined = "\n".join(array)
joined.writeToFile(...)
With Swift 5 (and, I guess, with Swift 4) you can use this code snippet, which works fine for me.
let array = ["hello", "world"]
let joinedStrings = array.joined(separator: "\n")
do {
try joinedStrings.write(toFile: outputURL.path, atomically: true, encoding: .utf8)
} catch let error {
// handle error
print("Error on writing strings to file: \(error)")
}
You need to reduce your array back down to a string:
var output = reduce(array, "") { (existing, toAppend) in
if existing.isEmpty {
return toAppend
}
else {
return "\(existing)\n\(toAppend)"
}
}
output.writeToFile(...)
The reduce method takes a collection and merges it all into a single instance. It takes an initial instance and closure to merge all elements of the collection into that original instance.
My example takes an empty string as its initial instance. The closure then checks if the existing output is empty. If it is, it only has to return the text to append, otherwise, it uses String Interpolation to return the existing output and the new element with a newline in between.
Using various syntactic sugar features from Swift, the whole reduction can be reduced to:
var output = reduce(array, "") { $0.isEmpty ? $1 : "\($0)\n\($1)" }
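Swift offers numerous ways to loop through an array. You can loop through the strings and write them to a text file one by one. Note that writeToFile replaces the file's contents on every call, so the loop below leaves only the last string in the file; appending would need something like NSFileHandle. Something like so: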
for theString in myArray {
theString.writeToFile(myPath, atomically: false, encoding: NSUTF8StringEncoding, error: nil);
}