Fixed #1112 Migrate to New NumPy Random Number Generator (RNG) API #1131
seanlaw wants to merge 7 commits into stumpy-dev:main from …
Conversation
@NimaSarajpoor I've tested the migration on …
NimaSarajpoor left a comment
@seanlaw
I've left just one comment for your consideration.
@NimaSarajpoor If it's okay with you, I'm going to add ~5 files at a time for you to incrementally review (there are 46 test files in total and I'll need to work on tutorials and other docs too).

Things are starting to get tricky. Previously, …

@NimaSarajpoor I added three new files for you to review
```python
# naive.distance(naive.z_norm(T[10:30]), naive.z_norm(T[110:130])) = 0.47
# naive.distance(naive.z_norm(T[10:30]), naive.z_norm(T[210:230])) = 0.24
# naive.distance(naive.z_norm(T[110:130]), naive.z_norm(T[210:230])) = 0.72
# Hence T[10:30] is the motif representative for this motif
```
I think these comments should be revised. They show the calculation of z-normalized distances. However, `aamp_motifs` is for the non-normalized case.
Hmm, what should it be changed to? I don't remember adding this comment :(
This?
```python
# naive.distance(T[10:30], T[110:130]) = 0.47
# naive.distance(T[10:30], T[210:230]) = 0.24
# naive.distance(T[110:130], T[210:230]) = 0.72
# Hence T[10:30] is the motif representative for this motif
```
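To make the distinction between the two distance flavors concrete, here is a minimal sketch; `z_norm` and `distance` below are local stand-ins for stumpy's `naive.z_norm` and `naive.distance` test helpers, not the actual test code:

```python
import numpy as np

def z_norm(a):
    # Subtract the mean and divide by the standard deviation
    return (a - np.mean(a)) / np.std(a)

def distance(a, b):
    # Plain (non-normalized) Euclidean distance, which is what the
    # non-normalized aamp_* functions use
    return np.linalg.norm(a - b)

a = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 1.0, 2.0, 4.0])

d = distance(a, b)                   # non-normalized distance
dz = distance(z_norm(a), z_norm(b))  # z-normalized distance
```

The two quantities generally differ, which is why the comments in a non-normalized test should show `naive.distance(T[i:j], ...)` rather than `naive.distance(naive.z_norm(T[i:j]), ...)`.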
```python
)
motif_indices_ref = np.array([[19, 77, 63, 52, 71]])
motif_subspaces_ref = [np.array([2])]
motif_distances_ref = np.array([[0.0, 0.10660435, 0.27927693]])
```
I cannot figure out how these numbers are obtained. Since the tests are passing, I guess all is good here. I assume these numbers were revised to match the actual output. So... an off-topic question: are we cheating here by using ref values that are based on the output of the performant function?
> So... an off-topic question: are we cheating here by using ref values that are based on the output of the performant function?
Yes, but the answer is nuanced:

- In traditional unit testing, nobody implements a "naive" version of the algorithm and, instead, they hardcode a reference answer
- The reason why we didn't implement a "naive" version was either because the naive implementation was too slow OR too complicated/not worth it
- While we may be cheating (this is true), nonetheless, if somebody breaks `aamp_mmotifs` (to produce incorrect results) then this test should still fail

Of course, it is completely possible that `aamp_mmotifs` is currently broken and our hardcoded answer leads to a bad test... Though, I have no reason to believe this.
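As a generic illustration of the hardcoded-reference pattern described above (the function and values below are hypothetical stand-ins, not stumpy's actual code):

```python
import numpy as np

def performant_sum_of_squares(a):
    # Stand-in for a fast function whose naive counterpart was never written
    return float(np.sum(a * a))

def test_regression_hardcoded():
    # The reference value was captured once from a trusted run; the test
    # guards against future regressions rather than proving correctness
    a = np.array([1.0, 2.0, 3.0])
    ref = 14.0
    np.testing.assert_allclose(performant_sum_of_squares(a), ref)
```

The trade-off is exactly as noted: if the function was already wrong when `ref` was captured, the test enshrines the bug, but any later behavioral change still gets caught.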
```diff
  # * (almost) identical step functions at indices 10, 110 and 210
  # * identical linear slopes at indices 70 and 170
- T = np.random.normal(size=300)
+ T = pytest.RNG.normal(size=300)
```
To be consistent with the other test functions, I was wondering if we should draw from a uniform distribution (i.e., `random`) rather than a normal distribution.
Yeah, I wasn't sure if there was a reason to draw from a normal distribution? If not, then I can change it.
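For context, `pytest.RNG` and `pytest.fix_rng_state` are presumably wired up in `conftest.py` along the lines of this sketch; the seed value and exact mechanism here are assumptions, not the PR's actual code:

```python
import numpy as np
import pytest

SEED = 0  # hypothetical fixed seed

def fix_rng_state():
    # Re-create the shared Generator so that "deterministic" tests
    # always see the same stream of random numbers
    pytest.RNG = np.random.default_rng(SEED)

# In the real PR this would likely happen in pytest_configure()
pytest.fix_rng_state = fix_rng_state
pytest.fix_rng_state()
```

After this setup, any test can call `pytest.RNG.random(...)`, `pytest.RNG.normal(...)`, etc., and deterministic tests can call `pytest.fix_rng_state()` first to reset the stream.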
```python
def test_random_ostinatoed(seed, dask_cluster):
    with Client(dask_cluster) as dask_client:

def test_deterministic_ostinato():
    pytest.fix_rng_state()
```
Do we need to fix state here? I can see we do not fix state for test_random_ostinato, and that should be OK.
Since the test says "deterministic", I was inclined to ensure that the inputs would be identical over time. If that weren't the case, then it seems like the deterministic tests could be removed.
The "deterministic" tests were likely put in place to resemble the stump tests where we have some hardcoded (deterministic) test_data and then, later, we allow random data.
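To make the distinction concrete, a deterministic test might look like this sketch (a hypothetical test, not stumpy's actual `ostinato` tests): seeding the generator makes the generated input, and hence the result, reproducible across runs.

```python
import numpy as np
import pytest

@pytest.mark.parametrize("seed", [41, 88])
def test_deterministic_example(seed):
    # Two generators seeded identically produce identical streams,
    # which is what makes a "deterministic" test reproducible
    T1 = np.random.default_rng(seed).random(64)
    T2 = np.random.default_rng(seed).random(64)
    np.testing.assert_allclose(T1, T2)
```

A "random" test, by contrast, would draw from an unseeded (or shared, advancing) generator, so its inputs differ on every run.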
```python
@pytest.mark.parametrize("seed", [41, 88, 290, 292, 310, 328, 538, 556, 563, 570])
def test_deterministic_ostinatoed(seed, dask_cluster):
    pytest.fix_rng_state()
```
Do we need to fix state here? I can see we do not fix state for test_random_ostinato.
Please see comments above:
#1131 (comment)
#1131 (comment)
I discovered yesterday that this migration will require a lot more work since, unlike the traditional …

As a result, I think I'm gonna need to start all over with a new PR and rethink what needs to be done.
Pull Request Checklist
Below is a simple checklist but please do not hesitate to ask for assistance!
- Install `black` (i.e., `python -m pip install black` or `conda install -c conda-forge black`)
- Install `flake8` (i.e., `python -m pip install flake8` or `conda install -c conda-forge flake8`)
- Install `pytest-cov` (i.e., `python -m pip install pytest-cov` or `conda install -c conda-forge pytest-cov`)
- Run `black --exclude=".*\.ipynb" --extend-exclude=".venv" --diff ./` in the root stumpy directory
- Run `flake8 --extend-exclude=.venv ./` in the root stumpy directory
- Run `./setup.sh dev && ./test.sh` in the root stumpy directory

To Do List
Replace legacy `np.random` calls:

- Replace `np.random.rand(x)` with `pytest.RNG.random(x)`
- Replace `np.random.rand(x*y).reshape(x, y)` with `pytest.RNG.random(size=(x, y))`
- Replace `np.random.uniform(x, y, [z])` with `pytest.RNG.uniform(x, y, size=z)`
- Replace `np.random.permutation([x, y, z])` with `pytest.RNG.permutation([x, y, z])`
- Replace `np.random.randint(x, y, z)` with `pytest.RNG.integers(x, y, z)`
- Replace `np.random.choice([x, y], l, replace=True)` with `pytest.RNG.choice([x, y], l, replace=True)`
- Replace `np.random.normal(size=x)` with `pytest.RNG.normal(size=x)`
- Remove `np.random` from docstrings, README, tutorials, etc.
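As a quick sanity check of the legacy-to-`Generator` mappings above, the following sketch exercises each replacement with a local `rng` standing in for `pytest.RNG`:

```python
import numpy as np

rng = np.random.default_rng(42)  # stands in for pytest.RNG

x, y, z = 3, 4, 5

a = rng.random(x)                         # was: np.random.rand(x)
b = rng.random(size=(x, y))               # was: np.random.rand(x * y).reshape(x, y)
c = rng.uniform(0.0, 1.0, size=z)         # was: np.random.uniform(0.0, 1.0, [z])
p = rng.permutation([10, 20, 30])         # same method name in both APIs
i = rng.integers(0, 10, z)                # was: np.random.randint(0, 10, z)
ch = rng.choice([1, 2], 4, replace=True)  # same method name in both APIs
n = rng.normal(size=x)                    # was: np.random.normal(size=x)
```

Note that `randint` is the one rename that can silently bite: the `Generator` method is `integers`, and (like `randint`) its upper bound is exclusive by default.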