How to remove duplicate in an array in Rust? - arrays

I have generated an array of numbers and would like to remove the duplicates. In JavaScript I can just use [...new Set(arr)] and get the job done.
In Rust, I haven't found a simple way to achieve that so far.
I've written:
use rand::{thread_rng, Rng};
use itertools::Itertools;
fn main() {
let mut arr:Vec<u8> = Vec::new();
for _ in 0..10 {
arr.push(thread_rng().gen_range(0..10))
}
println!("random {:?}", arr);
arr.iter().unique();
println!("unique {:?}", arr);
}
The output is:
random [7, 0, 3, 6, 7, 7, 1, 1, 8, 6]
unique [7, 0, 3, 6, 7, 7, 1, 1, 8, 6]
So I've tried to get the "no duplicate" result in another variable:
let res = &arr.iter().unique();
The result was:
Unique { iter: UniqueBy { iter: Iter([1, 2, 0, 0, 7, 0, 2, 2, 1, 6]), used: {} } }
Also, it seems I can't sort the array before removing the duplicates. This code returns an error: no method named 'iter' found for unit type '()' in the current scope method not found in '()'.
arr.sort().iter().unique();
Also, maybe there is a way to achieve the sort+unique value output without external crates?

Using the standard library
Sorting an array is usually an OK way of deduplicating it, but unless you are using a radix sort (which is not the sorting method Rust uses), it is asymptotically better to do what you would do in JS and go through a hash set. Here is the Rust equivalent:
use std::collections::HashSet;
let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
let uniqued_vector = a_vector
.into_iter()
.collect::<HashSet<_>>()
.into_iter()
.collect::<Vec<_>>();
This will turn your array into an iterator, then that iterator into a HashSet (which will deduplicate it), then back again into an iterator form, then finally into an array.
See it on the playground.
If you wonder why we have to go back and forth between these iterator representations, it's because they are the "interface" Rust uses to transform any datatype into any other datatype, very efficiently and while allowing you to perform some operations along the way easily. Here we don't actually need to do anything more than the conversion, which is why it may seem a little verbose.
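One caveat with the HashSet round trip is that the resulting order is arbitrary. If the original order matters, a minimal sketch that stays within the standard library could look like the following. The helper name dedup_keep_order is made up for this example, and it assumes the elements are cheap to clone:

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Removes duplicates in place, keeping the first occurrence of each value.
/// `dedup_keep_order` is a hypothetical helper, not a standard library function.
fn dedup_keep_order<T: Eq + Hash + Clone>(items: &mut Vec<T>) {
    let mut seen = HashSet::new();
    // `HashSet::insert` returns `true` only when the value was not present yet,
    // so `retain` keeps exactly the first occurrence of each value.
    items.retain(|item| seen.insert(item.clone()));
}

fn main() {
    let mut a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
    dedup_keep_order(&mut a_vector);
    println!("{:?}", a_vector); // [7, 0, 3, 6, 1, 8]
}
```

This trades the extra clone per element for keeping the vector's order and avoiding a second allocation of the result.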
Using the itertools crate
The itertools crate provides utilities to work on iterators (the same that we use as an interface to convert between datatypes). However, a peculiarity of iterators is that they are lazy, in a way, in the sense that they, in themselves, are not a datatype used to store information. They only represent operations performed, through the iterable interface, on a collection. For this reason, you actually need to transform an iterator back into a usable collection (or consume it in any way), otherwise it will do nothing (literally).
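To make the laziness concrete, here is a small sketch using only the standard library. It counts how often the closure given to map actually runs, before and after the iterator is consumed. The function name count_map_calls is invented for this illustration:

```rust
use std::cell::Cell;

/// Returns how many times the `map` closure ran before and after `collect`.
/// `count_map_calls` is a hypothetical name used only for this illustration.
fn count_map_calls(input: &[u8]) -> (usize, usize) {
    let calls = Cell::new(0);
    let doubled = input.iter().map(|x| {
        calls.set(calls.get() + 1);
        u32::from(*x) * 2
    });
    // Building the adaptor does no work, so the closure has not run yet.
    let before = calls.get();
    let _result: Vec<u32> = doubled.collect();
    // Consuming the iterator runs the closure once per element.
    let after = calls.get();
    (before, after)
}

fn main() {
    let (before, after) = count_map_calls(&[7, 0, 3]);
    println!("before collect: {before}, after collect: {after}"); // before collect: 0, after collect: 3
}
```

The same applies to .unique(): calling it without consuming the result leaves the original vector untouched.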
So the correct version of your code would probably be
use itertools::Itertools;
let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
let uniqued_vector = a_vector
.into_iter()
.unique()
.collect::<Vec<_>>();
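You don't need to sort anything because, internally, .unique() works pretty much like the first implementation: it remembers the elements it has already seen in a hash set. Unlike the HashSet round trip, it yields the elements in their original order.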
Sorting the array
As said earlier, sorting the array is fine, so you might still want to do that. Unlike the previous solutions, this can't be done with iterators alone, because you can't sort an iterator: no such method is provided by the Iterator trait, nor by the actual type produced by a_vector.into_iter(). Once the array is sorted, you then want to deduplicate it, that is, remove consecutive repetitions, which the Iterator trait doesn't provide either. Both operations are provided by Vec, though, so the solution is simply:
let mut a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
a_vector.sort();
a_vector.dedup();
And then a_vector contains unique elements.
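As a rough sketch, the two calls can be wrapped in a small generic helper. The name sorted_unique is made up here, and sort_unstable is swapped in for sort on the assumption that the relative order of equal elements doesn't matter:

```rust
/// Returns the distinct values of `items` in ascending order.
/// `sorted_unique` is a hypothetical helper name.
fn sorted_unique<T: Ord>(mut items: Vec<T>) -> Vec<T> {
    // `sort_unstable` is sufficient for this sketch: for values like integers,
    // equal elements are interchangeable.
    items.sort_unstable();
    // `dedup` only removes *consecutive* repeats, which is why sorting comes first.
    items.dedup();
    items
}

fn main() {
    let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
    println!("{:?}", sorted_unique(a_vector)); // [0, 1, 3, 6, 7, 8]

    // Without sorting, only adjacent duplicates disappear.
    let mut unsorted = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
    unsorted.dedup();
    println!("{:?}", unsorted); // [7, 0, 3, 6, 7, 1, 8, 6]
}
```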
Note that iterators lack these two operations only in the standard library. Itertools provides both a sorted method and a dedup one, so with itertools you could do:
let a_vector = vec![7, 0, 3, 6, 7, 7, 1, 1, 8, 6];
let uniqued_vector = a_vector
.into_iter()
.sorted()
.dedup()
.collect::<Vec<_>>();
But at this point you'd be better off using .unique().
If you wonder what the difference between .iter() and .into_iter() is, see this question.
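In short, and as a simplified sketch rather than a full explanation: on a vector, .iter() borrows and yields references, while .into_iter() consumes the vector and yields owned values. The two function names below are invented for the example:

```rust
/// Borrowing version: `iter()` yields `&u8`, so the caller keeps the data.
fn total_borrowed(values: &[u8]) -> u32 {
    values.iter().map(|x| u32::from(*x)).sum()
}

/// Consuming version: `into_iter()` on a `Vec<u8>` yields `u8` and takes ownership.
fn total_owned(values: Vec<u8>) -> u32 {
    values.into_iter().map(u32::from).sum()
}

fn main() {
    let arr: Vec<u8> = vec![7, 0, 3];
    println!("{}", total_borrowed(&arr)); // 10
    println!("{}", total_owned(arr)); // 10
    // `arr` has been moved into `total_owned`; using it here would not compile.
}
```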

Inspired by the example of retain(), we can rely on an intermediate sequence of booleans (see solution_1() below).
I don't much like the idea of intermediate storage, but I guess (I should have benchmarked, but I did not) that this simple vector of booleans is less expensive than creating a HashSet or sorting.
For an in-place solution, I'm afraid we have to write the algorithm ourselves.
The trick relies on split_at_mut() in order to avoid a multiple-borrows issue (see solution_2() below).
use rand::{thread_rng, Rng};
fn solution_1(mut arr: Vec<u8>) -> Vec<u8> {
println!("random {:?}", arr);
let keep: Vec<_> = arr
.iter()
.enumerate()
.map(|(i, x)| !arr[0..i].contains(x))
.collect();
let mut keep_iter = keep.iter();
arr.retain(|_| *keep_iter.next().unwrap());
println!("unique {:?}", arr);
arr
}
fn solution_2(mut arr: Vec<u8>) -> Vec<u8> {
println!("random {:?}", arr);
let mut kept = 0;
for i in 0..arr.len() {
let (head, tail) = arr.split_at_mut(i);
let x = tail.first_mut().unwrap();
if !head[0..kept].contains(x) {
if kept != i {
std::mem::swap(&mut head[kept], x);
}
kept += 1;
}
}
arr.truncate(kept);
println!("unique {:?}", arr);
arr
}
fn main() {
let mut arr: Vec<u8> = Vec::new();
for _ in 0..20 {
arr.push(thread_rng().gen_range(0..10))
}
assert_eq!(solution_1(arr.clone()), solution_2(arr));
}
/*
random [2, 3, 6, 8, 4, 1, 1, 9, 1, 9, 1, 6, 2, 5, 5, 4, 0, 0, 5, 4]
unique [2, 3, 6, 8, 4, 1, 9, 5, 0]
random [2, 3, 6, 8, 4, 1, 1, 9, 1, 9, 1, 6, 2, 5, 5, 4, 0, 0, 5, 4]
unique [2, 3, 6, 8, 4, 1, 9, 5, 0]
*/

Several things:
.unique() returns an iterator. You have to turn it back into a vector with .collect(). This section of the Rust book contains the basic rules around using iterators.
.sort() modifies the vector in place and returns the unit type (). The unit type is not the absence of a type: it is a type with a single value, (), returned by functions that have nothing meaningful to return and are called only for their side effects (like modifying something in place). That is why .iter() cannot be chained onto arr.sort().
Try:
arr.sort();
let uniques: Vec<_> = arr.iter().unique().collect();
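To illustrate the point about the unit type, here is a tiny sketch with a made-up function sort_in_place. A function declared without a return type returns (), and that value can even be bound to a variable:

```rust
/// A function without an explicit return type returns the unit type `()`.
/// `sort_in_place` is a hypothetical wrapper used only for illustration.
fn sort_in_place(values: &mut [u8]) {
    values.sort();
}

fn main() {
    let mut arr: Vec<u8> = vec![7, 0, 3, 6, 7];
    // The annotation compiles because the call really does produce a value of type `()`.
    let result: () = sort_in_place(&mut arr);
    println!("{:?} {:?}", result, arr); // () [0, 3, 6, 7, 7]
}
```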

If you don't need the original insertion order, you can get the array sorted and without duplicates by going through a BTreeSet:
let unique = std::collections::BTreeSet::from([7, 0, 3, 6, 7, 7, 1, 1, 8, 6]);
let arr = unique.into_iter().collect::<Vec<_>>();
println!("{arr:?}");

Related

How to get a pointer (instead of a copy) to an array in Swift?

In Swift, assigning an array to a new variable actually makes a copy. For example (as in the Apple documentation for Array):
var numbers = [1, 2, 3, 4, 5]
var numbersCopy = numbers
numbers[0] = 100
print(numbers)
// Prints "[100, 2, 3, 4, 5]"
print(numbersCopy)
// Prints "[1, 2, 3, 4, 5]"
How do I actually get a pointer to the same array, so that modifying the elements is reflected in the same array? (The reason for this is that I access it through static instances of another class, e.g. "SomethingManager.sharedInstance.arrayList[aKey]", and I'd like to shorten it to an assigned pointer variable.)
(I'm interested to know how to do this in Swift 4 and 5. I don't see any existing question for Swift language.)
EDIT:
I'm providing my rationale for the need to have a pointer instead of a copy.
Say, I have the following code:
var childrenTasks = [Int64: [TaskRef]]()
defined in a class, which is accessed:
MyClass.singleton.parentTask[parentTaskID].childrenTasks[taskRefID]
As you can see, the code to access childrenTasks is very long. I'd like to have a pointer. Just as an illustration:
var aPointerToChildrenTasks = MyClass.singleton.parentTask[parentTaskID].childrenTasks[taskRefID] // I want a pointer, not a copy!
aPointerToChildrenTasks.remove(at: anIndex) // if it is a pointer, I can manipulate the same set of values of the array
It will help make my code easier to read. I need a pointer to manipulate the same set of values, so I use a "var". If access were read-only, I could use a "let", but there would still be a performance penalty if I got a copy.
How do I get a pointer in Swift? Is this possible? (I know that in Kotlin it is possible as it is pass-by reference.)
EDIT: I see some suggestion that this question is a duplicate. No, it is not. Those other questions/answers are specifically focused on inout parameters. For my case, I just want a pointer to work in the same function/method.
Not a ‘pure’ Swift solution, but using NSArray will give you the reference semantics you desire.
NSArray is toll-free bridgeable to Array, so you can use plain as instead of as!
var numbers = [1, 2, 3, 4, 5]
var numbersCopy = numbers as NSArray
numbers[0] = 100
print(numbers)
[100, 2, 3, 4, 5]
print(numbersCopy as Array)
[1, 2, 3, 4, 5]
If you are modifying the 'copy' you will need to use an NSMutableArray.
Edit:
oops 🤭
I think I was confused by the naming of your variable numbersCopy. I see now that you want the 'copy' to share the same value as the original. By capturing the variable numbers in a block, and executing that block later, you can get the current value of numbers, and you don't need to use NSArray at all.
var numbers = [1, 2, 3, 4, 5]
var numbersCopy = {numbers}
numbers[0] = 100
print(numbers)
[100, 2, 3, 4, 5]
print(numbersCopy())
[100, 2, 3, 4, 5]
If it's just about convenience, consider making a utility function like this:
func withChildrenTasks(of parentTaskID: Int64, taskRefID: TaskRef, body: (inout [TaskRef]) -> ()) {
body(&MyClass.singleton.parentTask[parentTaskID].childrenTasks[taskRefID])
}
withChildrenTasks(of: parentTaskID, taskRefID: taskRefID) { tasks in
// do stuff with tasks
}
You can't create an "inout var", but you can always make a callback that accepts an inout parameter, so this is an easy workaround. I expect that the Swift compiler would be pretty good about optimizing it away.
If it's because you actually want to share the array reference, you will either need to wrap it in a reference type (class SharedArray<T> { var array = [T]() } might be enough for that purpose), or you could use NSMutableArray from Foundation.
Use a computed property:
var numbers = [1, 2, 3, 4, 5]
var numbersCopy: [Int] {
get { numbers }
set { numbers = newValue }
}
numbers[0] = 100
print(numbers)
// Prints "[100, 2, 3, 4, 5]"
print(numbersCopy)
// Prints "[100, 2, 3, 4, 5]"
numbersCopy[1] = 200
print(numbers)
// Prints "[100, 200, 3, 4, 5]"
print(numbersCopy)
// Prints "[100, 200, 3, 4, 5]"

Does ruby support an enumerable map_cons method or its equivalent?

Ruby has a handy method for enumerables called each_cons, which "iterates the given block for each array of consecutive elements." This is really nice, except that it is definitely an each method: it returns nil upon completion, not an array of the values you've looped over like map would.
However, if I have a situation where I need to iterate over an enumerable type, take an element and its cons, then perform some operation on them and return them back into an array what can I do? Normally, I'd use map for this sort of behavior. But map_cons doesn't exist.
An example:
Given a list of integers, I need to see which ones of those integers repeat and return a list of just those integers
[1, 1, 4, 5, 6, 2, 2] ## I need some function that will get me [1, 2]
I can say:
[1, 1, 4, 5, 6, 2, 2].each_cons(2) {|e| e[0] if e[0] == e[1]}
But, since it eachs over the array, it will complete successfully and return nil at the end. I need it to behave like map and not like each.
Is this behavior something that ruby supports? Am I coming at it from the wrong direction entirely?
The documentation of each_cons ends with this innocent phrase: "If no block is given, returns an enumerator." Most methods of Enumerable do this. What can you do with an Enumerator? Nothing truly impressive. But Enumerators include Enumerable, which does provide a large number of powerful methods, map being one of them. So, as Stefan Pochmann does:
[1, 1, 4, 5, 6, 2, 2].each_cons(2).map { |e| e[0] if e[0] == e[1] }
each_cons is called without a block, so it returns an Enumerator. map is simply one of its methods.
Just add map?
[1, 1, 4, 5, 6, 2, 2].each_cons(2).map { |e| e[0] if e[0] == e[1] }
=> [1, nil, nil, nil, nil, 2]
Ruby 2.7 added the method filter_map which makes this even easier:
p [1, 1, 4, 5, 6, 2, 2].each_cons(2).filter_map{|a,b| a if a == b} # => [1, 2]

Swift create one Int array from two int arrays by taking the max() at each index

I feel like this may call for reduce, map or something like it, but I'm not yet familiar enough with these and was hoping someone here might be. Let's say I have
arrayOne = [1, 3, 7]
arrayTwo = [2, 1, 10]
the expected result for what I'm trying to do would be
mergedArray = [2, 3, 10]
I know I can do this with a relatively simple for loop in a method but I am looking for a more "swift" way to do it if it's possible.
And Yes, both arrays will always be the same length.
This will work:
let arrayOne = [1, 3, 7]
let arrayTwo = [2, 1, 10]
let mergedArray = zip(arrayOne, arrayTwo).map{max($0, $1)}
First, pair up the elements of the two arrays with zip, and then use map to take the max of each pair.

When iterating through an array in Swift, is there a reference the index/iteration of the loop?

Say I have the following code:
var array = [1, 5, 6, 2, 7, 4]
for item in array {
println(index)
}
Is there a way to access what iteration the loop is on within the loop's body? (i.e.: 0, 1, 2, 3...)
var array = [1, 5, 6, 2, 7, 4]
// Swift 1 syntax; in Swift 3 and later, write array.enumerated() and print(...) instead
for (index,item) in enumerate(array) {
println("\(index): \(item)")
}
You can create a count variable. Other than that, you need to use a regular for loop.

Sort array in Scala partially

How can I sort a region in an array in Scala?
I have an array, say var DivA = new Array[Int](100005).
I filled the elements of DivA up to some position that is less than 100005.
For example, my array looks like DivA = {1,2,10,5,15,20}.
In C++ we have sort(DivA, DivA + N), but how can we sort an array in Scala up to a certain index?
I tried Sorting.quickSort(DivA), but I don't know how to specify the index up to which I want to sort; the statement below sorts the whole array, so that it looks like DivA={0,0,0,0,0}.
import scala.util.Sorting
var DivA=new Array[Int](10000);
var posA=0;
for(j<-15 to 0 by -1){
DivA(posA)=j;
posA=posA+1;
}
Sorting.quickSort(DivA);
for(j<-0 to posA-1)
{
print(DivA(j));
print(" ");
}
If you want to sort a region in an array, you can use the sort method in java.util.Arrays:
java.util.Arrays.sort(DivA, 0, posA)
This method takes a fromIndex (inclusive) and a toIndex (exclusive), so passing posA from your code sorts exactly the filled elements. Note that this method sorts in place. That's probably OK in your case, as the style you apply seems to be "imperative".
If performance matters, arrays will probably be faster than other collections. Other reasons for using arrays might be memory consumption or interoperability.
Otherwise you can probably design your data structures/algorithm in a different (more "functional") way by using idiomatic Scala collections.
You can use sorted or sortBy, depending on your needs. In case of simple Array[Int], sorted should be enough:
val arr = Array.iterate(15, 15)(_ - 1)
arr: Array[Int] = Array(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
val sortedArray = arr.sorted
sortedArray: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
You can provide your own Ordering typeclass if you want some other kind of order.
You can find it in the documentation.
However, as noted in another answer, you should prefer an immutable Vector unless you have a good reason to switch to Array.
The sorted method is common to all collections, and Arrays in Scala are implicitly converted to a collection.
So you can call sorted on an Array[Int], e.g.:
scala> val p = (10 to 1 by -1).toArray
p: Array[Int] = Array(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
scala> p.sorted
res5: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
First, if you are programming in Scala, you should think about using Vector or List, because those are immutable, which is the preferred way of programming in Scala.
Both Vector and List have a sorted method, which returns a new, sorted collection.

Resources