-
Notifications
You must be signed in to change notification settings - Fork 58
Description
Hello!
I have been playing around with diffsims' ReciprocalLatticeVector.unique, which uses Miller.unique, which again uses Object3d.unique.
Specifically, I needed to be able to re-construct the initial array from the unique ones, so I specified "return_inverse=True".
This led to some issues with the ordering of the elements, since the returned data from Object3d.unique sorts the indices of the unique elements, without the necessary changes to the array of indices to reconstruct the original.
In the process of addressing this, I also fixed the symmetry-aware Miller.unique, so initial order is preserved here as well. However, that broke some tests, in particular the final assertion in orix/tests/test_vector3d/test_miller.py::TestMiller::test_unique. This line checks the indices of the first 10 unique elements in a large Miller object, and all these indices have fairly large and unexpectedly disorganized values (65, 283, 278 ect.)
By changing the "idx" and "inv" to preserve the order in the input when using Object3d.unique and Miller.unique, the indices in the mentioned test becomes [0, 1, 2, ....] instead.
Another test that broke was orix/tests/test_quaternion/test_orientation_region.py::test_get_large_cell_normals. I did not investigate this one further, but from what I can see it is a lot of raw data that gets compared to computed data. The order of the pre-computed data might very well be intentional, and not a consequence of Object3d.unique.
I personally think it better that the order of the input is preserved, but as this seems to be an old implementation I worry it will break things other places too.
Edit: To illustrate:
from numpy import array_equal
from orix.vector import Vector3d
# Make some random vector with duplicates
rnd = Vector3d.random(30)
rnd.data = rnd.data.round(5)
v = Vector3d.stack([rnd.get_random_sample(10, replace=True) for _ in range(10)]).flatten()
# Get unique elements and reconstruction arrays
u, idx, inv = v.unique(return_index=True, return_inverse=True)
# Check that there are duplicates
assert v.size > u.size
# Check that all elements are present in both, ignoring order
def unordered_equal(vec1: Vector3d, vec2: Vector3d) -> bool:
assert vec1.shape == vec2.shape
# Use tuples so all three elements are checked
vec1_tuples = [tuple(v.data.squeeze().tolist()) for v in vec1]
vec2_tuples = [tuple(v.data.squeeze().tolist()) for v in vec2]
# Check both ways
return all(v1 in vec2_tuples for v1 in vec1_tuples) and all(v2 in vec1_tuples for v2 in vec2_tuples)
# Passes
assert unordered_equal(v[idx], u)
assert unordered_equal(v, u[inv])
assert unordered_equal(v[idx][inv], v)
# idx and inv are self-consistent
assert array_equal(v[idx][inv].data, v.data)
# Fails
assert array_equal(v[idx].data, u.data)
assert array_equal(v.data, u[inv].data)