Inconsistencies while running PySR on local computer VS remote cluster #1017
-
You have to use Also, how are you specifying the random state?
-
I followed your advice and managed to isolate the issue in the script below, which depends only on numpy=2.24 and pysr=1.5.8:
from pysr import PySRRegressor
from utils_dataset import variable_names, X_units, y_units, X, y
from utils_model import params
model = PySRRegressor(**params)
model.fit(X, y, variable_names=variable_names, X_units=X_units, y_units=y_units)
print(f'{model.equations_} equations with complexity: {model.equations_['complexity'].to_list()}')
and this script prints:
I put this script in a Python file, main.py. It only requires two other Python files in the same folder: utils_dataset.py for the data, and utils_model.py for the PySR parameters. The latter may be the most interesting for you, as it contains 'params', the dictionary specifying the PySR parameters. Maybe from this file you will be able to spot which parameters cause this weird behavior, or which obvious mistake I might have made. Thanks in advance for your feedback!
-
I will create a minimal reproducible example using only PySRRegressor (not a custom subclass) with as many default PySRRegressor parameters as possible, and open a "bug" issue for it. In the meantime, I will close this discussion. Thanks again for your answers.
-
Hello everyone,
I am trying to run PySR on a remote cluster (set up with Ubuntu 24.04.2, just like my local computer).
I have pulled my GitHub code onto the cluster and set up a virtual environment identical to the one on my local computer (python 3.12.3, pip 25.2, julia 1.11.6, PySR 1.5.8).
I then ran my tests (which pass on my local computer) on the cluster. These tests basically fit a PySRRegressor to different types of data.
Most tests pass, but some fail. In particular, when I look at one of the failing cases, I notice that the 'equations_' dataframes are not even the same size: 15 rows on the remote cluster versus 13 on the local computer, even though the PySR parameters, the data and the environment are the same on both machines. Note that the random state in PySR is fixed, so the issue does not come from that.
My question is: what have I missed? Is there something else I should check on the remote cluster to ensure that the virtual environment is exactly the same as on my local computer? After some printing, I am already certain that the data and parameters are identical, so the issue is likely due to the environment. Or could there be another reason?
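One way to make the "is the environment really the same?" check less manual is to dump a small fingerprint on each machine and diff the two. The sketch below is only an illustration using the Python standard library. The helper names `environment_fingerprint` and `diff_fingerprints` are hypothetical, and the package list (pysr, numpy, juliacall) is an assumption to adapt. It only covers the Python side, not the versions of the Julia packages.

```python
import json
import platform
import sys
from importlib import metadata


def environment_fingerprint(packages):
    """Collect interpreter, OS, and installed package versions into a plain dict."""
    info = {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "system": platform.platform(),
    }
    for name in packages:
        try:
            info[f"pkg:{name}"] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[f"pkg:{name}"] = "not installed"
    return info


def diff_fingerprints(local, remote):
    """Return {key: (local_value, remote_value)} for every key whose values differ."""
    keys = sorted(set(local) | set(remote))
    return {
        key: (local.get(key), remote.get(key))
        for key in keys
        if local.get(key) != remote.get(key)
    }


# Assumed package list; extend it with whatever the tests import.
fingerprint = environment_fingerprint(["pysr", "numpy", "juliacall"])
print(json.dumps(fingerprint, indent=2, sort_keys=True))
```

Running this on both machines and saving the printed JSON gives two files that can be loaded and passed to `diff_fingerprints`. An empty result would suggest the Python-side versions match, which narrows the search to other sources of difference.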
PS: For the test that fails, the input data and PySR parameters are quite complex. This is why I preferred to ask this question without providing any code/example.