TypeError: invalid operation on untyped list when running timeseries example #8

Open
benjaminh opened this issue Oct 11, 2024 · 3 comments

Comments

@benjaminh

Hello,

I was trying to compute clustering metrics for time series with DTW (the clustering itself was done with the tslearn library), and ended up on your project!

I first ran into issues with my own clustering dataset (something to do with inf values), so I tried the example provided in your documentation: https://pycvi.readthedocs.io/en/latest/examples/basic_usage_time_series.html

When running the code, I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[38], line 28
     23 for k in k_range:
     24 
     25     # Generate the clusters assuming that there are k clusters
     26     # Clustering model to use, could be any sklearn-like clustering class
     27     model = TimeSeriesKMeans(n_clusters=k)
---> 28     labels_pred = model.fit_predict(X)
     30     # From predicted cluster-label for each datapoint to a list of
     31     # datapoints for each cluster.
     32     clusters_pred = get_clustering(labels_pred)

[...]

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/numba/typed/typedlist.py:283, in List._numba_type_(self)
    280 @property
    281 def _numba_type_(self):
    282     if self._list_type is None:
--> 283         raise TypeError("invalid operation on untyped list")
    284     return self._list_type

TypeError: invalid operation on untyped list

I suspected an issue due to a different version of numba or aeon, but I do have:

  • python 3.11.8
  • aeon 0.11.1
  • numba 0.60.0

Any idea what is causing this issue? Do I have to cast the input dataset to a specific type?

@nglm
Owner

nglm commented Oct 14, 2024

Hi @benjaminh! First of all, thank you for using PyCVI and for taking the time to share your issue!

I tried to run the basic_usage_time_series example again with:

  • python 3.11.10 (not 3.11.8, though)
  • aeon 0.11.1
  • numba 0.60.0

Everything works on my side, so I don't know what could be causing this, but it doesn't seem to be coming from aeon or numba. We'll have to figure that out together.

Could you share the following information:

  • Just to be sure, could you confirm that you used exactly the example provided in the documentation (with the pycvi_examples_utils file) without changing anything in it, in particular the dataset used? (In other words, when you encountered the error, you were not using your own clustering dataset.)
  • Could you also confirm that you are not using tslearn here at all? (PyCVI is currently not compatible with tslearn, only with scikit-learn, scikit-learn-extra, aeon and sktime.)
  • Could you share the whole output you get before getting the error?
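The version details requested above can be collected in one go. The following is a minimal sketch using only the standard library; the helper names `collect_versions` and `format_report` are hypothetical, and the package list is an assumption that should be extended with whatever else is installed in the environment.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version.

    Distributions that are not installed map to None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_report(versions):
    """Build a short plain-text report that can be pasted into an issue."""
    lines = [f"python {platform.python_version()}"]
    for name, version in versions.items():
        shown = version if version is not None else "not installed"
        lines.append(f"{name} {shown}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed list of relevant distributions; adjust to the environment.
    packages = ["aeon", "numba", "tslearn", "numpy", "scikit-learn"]
    print(format_report(collect_versions(packages)))
```

Running this inside the same environment (or notebook kernel) that raises the error matters, since a notebook may use a different interpreter than the shell.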

@benjaminh
Author

Hello,

Thanks for the feedback, here is some additional information:

  • I do use the example provided in https://pycvi.readthedocs.io/en/latest/examples/basic_usage_time_series.html. I only skipped the plotting part so that I would not have to copy the pycvi_examples_utils content, and I made sure to use the aeon TimeSeriesKMeans class and not the tslearn one.
  • I do have tslearn installed in my virtual environment, but I did not use it when executing the example code snippet; see the traceback below to confirm that.
  • Here is the full traceback displayed:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[24], line 28
     23 for k in k_range:
     24 
     25     # Generate the clusters assuming that there are k clusters
     26     # Clustering model to use, could be any sklearn-like clustering class
     27     model = aeon.clustering.TimeSeriesKMeans(n_clusters=k)
---> 28     labels_pred = model.fit_predict(X)
     30     # From predicted cluster-label for each datapoint to a list of
     31     # datapoints for each cluster.
     32     clusters_pred = get_clustering(labels_pred)

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/aeon/clustering/base.py:141, in BaseClusterer.fit_predict(self, X, y)
    123 def fit_predict(self, X, y=None) -> np.ndarray:
    124     """Compute cluster centers and predict cluster index for each time series.
    125 
    126     Convenience method; equivalent of calling fit(X) followed by predict(X)
   (...)
    139         Index of the cluster each time series in X belongs to.
    140     """
--> 141     self.fit(X)
    142     return self.predict(X)

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/aeon/clustering/base.py:58, in BaseClusterer.fit(self, X, y)
     56 _start_time = int(round(time.time() * 1000))
     57 X = self._preprocess_collection(X)
---> 58 self._fit(X)
     59 self.fit_time_ = int(round(time.time() * 1000)) - _start_time
     60 self._is_fitted = True

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/aeon/clustering/_k_means.py:203, in TimeSeriesKMeans._fit(self, X, y)
    201 for _ in range(self.n_init):
    202     try:
--> 203         labels, centers, inertia, n_iters = self._fit_one_init(X)
    204         if inertia < best_inertia:
    205             best_centers = centers

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/aeon/clustering/_k_means.py:235, in TimeSeriesKMeans._fit_one_init(self, X)
    233 prev_labels = None
    234 for i in range(self.max_iter):
--> 235     curr_pw = pairwise_distance(
    236         X, cluster_centres, metric=self.distance, **self._distance_params
    237     )
    238     curr_labels = curr_pw.argmin(axis=1)
    239     curr_inertia = curr_pw.min(axis=1).sum()

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/aeon/distances/_distance.py:402, in pairwise_distance(x, y, metric, **kwargs)
    393     return twe_pairwise_distance(
    394         x,
    395         y,
   (...)
    399         kwargs.get("itakura_max_slope"),
    400     )
    401 elif metric == "msm":
--> 402     return msm_pairwise_distance(
    403         x,
    404         y,
    405         kwargs.get("window"),
    406         kwargs.get("independent", True),
    407         kwargs.get("c", 1.0),
    408         kwargs.get("itakura_max_slope"),
    409     )
    410 elif metric == "mpdist":
    411     return _custom_func_pairwise(x, y, mpdist, **kwargs)

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/aeon/distances/_msm.py:430, in msm_pairwise_distance(X, y, window, independent, c, itakura_max_slope)
    425     return _msm_pairwise_distance(
    426         _X, window, independent, c, itakura_max_slope, unequal_length
    427     )
    429 _y, unequal_length = _convert_to_list(y, "y", multivariate_conversion)
--> 430 return _msm_from_multiple_to_multiple_distance(
    431     _X, _y, window, independent, c, itakura_max_slope, unequal_length
    432 )

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/numba/core/dispatcher.py:688, in _DispatcherBase.typeof_pyval(self, val)
    685 # Not going through the resolve_argument_type() indirection
    686 # can save a couple µs.
    687 try:
--> 688     tp = typeof(val, Purpose.argument)
    689 except ValueError:
    690     tp = types.pyobject

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/numba/core/typing/typeof.py:33, in typeof(val, purpose)
     31 # Note the behaviour for Purpose.argument must match _typeof.c.
     32 c = _TypeofContext(purpose)
---> 33 ty = typeof_impl(val, c)
     34 if ty is None:
     35     msg = _termcolor.errmsg(
     36         f"Cannot determine Numba type of {type(val)}")

File ~/miniconda3/envs/ds/lib/python3.11/functools.py:909, in singledispatch.<locals>.wrapper(*args, **kw)
    905 if not args:
    906     raise TypeError(f'{funcname} requires at least '
    907                     '1 positional argument')
--> 909 return dispatch(args[0].__class__)(*args, **kw)

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/numba/core/typing/typeof.py:50, in typeof_impl(val, c)
     47 if tp is not None:
     48     return tp
---> 50 tp = getattr(val, "_numba_type_", None)
     51 if tp is not None:
     52     return tp

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/numba/typed/typedlist.py:283, in List._numba_type_(self)
    280 @property
    281 def _numba_type_(self):
    282     if self._list_type is None:
--> 283         raise TypeError("invalid operation on untyped list")
    284     return self._list_type

TypeError: invalid operation on untyped list

I also tried to inspect the input dataset:

# -------------- Standard data handling operations ---------------------
# Load data
data, labels = load_data("Trace", "ucr")
(N, T, d) = data.shape

# Data pre-processing
scaler = MinMaxScaler()
# Scaling for each variable and not time step wise
X = scaler.fit_transform(data.reshape(N*T, d)).reshape(N, T, d)
print(f"X is of type {type(X)}, with shape {N}, {T}, {d}")

It outputs "X is of type <class 'numpy.ndarray'>, with shape 99, 275, 1". Does it look correct?
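Beyond type and shape, a few more properties of the array can be checked before it is handed to the clusterer. This is a minimal sketch assuming only NumPy; `describe_array` is a hypothetical helper, and the random array below is a stand-in with the same shape as the Trace data, not the dataset itself. Passing these checks would not by itself explain the TypeError, it only rules out some input problems such as the inf values mentioned earlier.

```python
import numpy as np


def describe_array(X):
    """Summarise properties of X that commonly matter for distance computations."""
    X = np.asarray(X)
    # NaN/inf counts are only meaningful for floating-point data.
    is_float = np.issubdtype(X.dtype, np.floating)
    return {
        "dtype": str(X.dtype),
        "shape": X.shape,
        "n_nan": int(np.isnan(X).sum()) if is_float else 0,
        "n_inf": int(np.isinf(X).sum()) if is_float else 0,
        "c_contiguous": bool(X.flags["C_CONTIGUOUS"]),
    }


if __name__ == "__main__":
    # Stand-in data with the shape reported above (N=99, T=275, d=1).
    X_demo = np.random.default_rng(0).random((99, 275, 1))
    print(describe_array(X_demo))
```

In the actual script, calling `describe_array(X)` right before `model.fit_predict(X)` would show whether the scaled data contains any non-finite values.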

@nglm
Owner

nglm commented Oct 23, 2024

Data type and shape seem normal and unfortunately, I can't see anything in the output that could indicate where the error comes from. Would you mind sharing your whole script?
