BUG: [RAY] ray initialisation sets _memory and object_store_memory to the same value, leading to crashes and less flexibility #7361

Open
2 of 3 tasks
Liquidmasl opened this issue Aug 6, 2024 · 2 comments
Labels
bug 🦗 Something isn't working Triage 🩹 Issues that need triage

Comments

@Liquidmasl

Liquidmasl commented Aug 6, 2024

Modin version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest released version of Modin.

  • I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

# set a breakpoint in ...\modin\core\execution\ray\common\utils.py line 138

import modin.pandas as pd
df = pd.DataFrame()

# check the contents of ray_init_kwargs,
# or really just look at the code there:

            # object_store_memory = _get_object_store_memory()
            # ray_init_kwargs = {
            #     "num_cpus": CpuCount.get(),
            #     "num_gpus": GpuCount.get(),
            #     "include_dashboard": False,
            #     "ignore_reinit_error": True,
            #     "object_store_memory": object_store_memory,
            #     "_redis_password": redis_password,
            #     "_memory": object_store_memory,
            #     "resources": RayInitCustomResources.get(),
            #     **extra_init_kw,
            # }

Issue Description

Modin sets _memory and object_store_memory to the same value. This not only leads to instability and crashes, it also reduces flexibility, because _memory can be set higher than the available shared memory while object_store_memory cannot.

Many of the issues I faced over the last few days with read_parquet() (although that one still fills up RAM until my PC crashes), to_parquet(), concat(), etc. stemmed from this: when the object store was full and a spill was attempted, a write violation happened and a raylet died.

I noticed that Modin runs a lot more stably when ray.init() is called manually. This is because Ray's own defaults do not set the two values to the same number.
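The difference between the two setups can be checked without a breakpoint. The following is a minimal sketch, assuming that `ray.cluster_resources()` reports `memory` and `object_store_memory` entries for the running cluster; the helper names `describe_memory_limits` and `report_from_running_ray` are made up for this illustration and are not part of Modin or Ray.

```python
from typing import Mapping


def describe_memory_limits(resources: Mapping[str, float]) -> str:
    """Summarise how the two Ray memory limits relate to each other."""
    memory = resources.get("memory")
    store = resources.get("object_store_memory")
    if memory is None or store is None:
        return "memory limits not reported"
    if memory == store:
        return "coupled: memory == object_store_memory"
    return f"decoupled: memory/object_store_memory = {memory / store:.2f}"


def report_from_running_ray() -> str:
    """Inspect a live Ray runtime (assumes Ray is installed and initialised)."""
    import ray  # imported lazily so the helper above works without Ray

    return describe_memory_limits(ray.cluster_resources())


# Illustrative numbers only, not measurements from the machine in this report.
coupled = describe_memory_limits({"memory": 8e9, "object_store_memory": 8e9})
decoupled = describe_memory_limits({"memory": 16e9, "object_store_memory": 4e9})
print(coupled)
print(decoupled)
```

Calling `report_from_running_ray()` once after Modin has started Ray and once after a manual `ray.init()` should make the two configurations directly comparable.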

Also, it would be great if the Ray dashboard were not disabled by default with no way to enable it when Ray is initialised through Modin. But I digress.

Expected Behavior

If no manual configuration was done and no environment variables were set, Ray's default initialisation should be used.
And if not the default, then at least not something this debilitating.

After initializing Ray manually and just setting _memory to something much larger, things simply started working.
Setting MODIN_MEMORY to something higher while using Modin's initialisation did not work, because it led to a ValueError from Ray stating that object_store_memory can't be set that high (even though I never cared about object_store_memory).
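For reference, the manual workaround looks roughly like this. It is a sketch, not Modin's own code: `build_ray_init_kwargs` and `init_ray_before_modin` are hypothetical helpers, and the byte sizes are placeholders that have to be adapted to the machine. The only Ray calls used are `ray.is_initialized()` and `ray.init()` with the same keyword names that appear in the snippet from utils.py above.

```python
from typing import Any, Dict

GIB = 1024 ** 3


def build_ray_init_kwargs(
    object_store_bytes: int,
    task_memory_bytes: int,
    include_dashboard: bool = False,
) -> Dict[str, Any]:
    """Build ray.init() keyword arguments with the two memory limits decoupled."""
    if object_store_bytes <= 0 or task_memory_bytes <= 0:
        raise ValueError("memory sizes must be positive byte counts")
    return {
        "object_store_memory": object_store_bytes,  # shared-memory object store
        "_memory": task_memory_bytes,  # set independently of the object store
        "include_dashboard": include_dashboard,
        "ignore_reinit_error": True,
    }


def init_ray_before_modin(kwargs: Dict[str, Any]) -> None:
    """Start Ray manually; call this before the first Modin operation."""
    import ray  # imported lazily so building the kwargs does not require Ray

    if not ray.is_initialized():
        ray.init(**kwargs)


# Placeholder sizes: a small object store and a larger _memory value.
kwargs = build_ray_init_kwargs(object_store_bytes=4 * GIB, task_memory_bytes=16 * GIB)
print(kwargs)
```

Running `init_ray_before_modin(kwargs)` before `import modin.pandas as pd` and the first DataFrame operation is the intended order. As far as I can tell, Modin only calls `ray.init()` itself when no Ray runtime is initialised yet, so the manual settings should stay in effect.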

Error Logs

Replace this line with the error backtrace (if applicable).

Installed Versions

INSTALLED VERSIONS

commit : c8bbca8
python : 3.11.8.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 186 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_Austria.1252
Modin dependencies

modin : 0.31.0
ray : 2.34.0
dask : 2024.7.1
distributed : 2024.7.1
pandas dependencies

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 68.2.2
pip : 24.1.2
Cython : 0.29.37
pytest : 8.2.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.23.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.3.1
gcsfs : None
matplotlib : 3.8.2
numba : 0.60.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 15.0.2
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.13.0
sqlalchemy : 2.0.29
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

@Retribution98
Collaborator

Hi @Liquidmasl

Modin uses these default values because they help achieve good performance in general.
If you have a specific case and Modin's configuration variables don't help, you can initialize Ray yourself.

@Liquidmasl
Author

I see.
I understand that my experience is by no means representative of everyone's. But with these defaults I had numerous bluescreens, freezes and crashes, which made debugging and figuring this out a lot more troublesome than necessary.

I did not want to initialize Ray myself precisely because I assumed Modin would know best, but it gave me no option to adapt just the two values that caused issues for me (_memory and include_dashboard).

If you think the current defaults work fine most of the time and my situation is an outlier, fair enough!
I still think it would be good to introduce config params or env vars that allow setting _memory, object_store_memory and include_dashboard manually while still relying on Modin's Ray initialisation.
As I understand it, Modin initialising Ray itself is a relatively new feature, so maybe there will be some changes along the way anyway. For now, now that I understand this, it's fine to initialize Ray manually.
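To make the suggestion concrete, here is a rough sketch of what such overrides could look like. Everything in it is an assumption for illustration: the `EXAMPLE_RAY_*` variable names do not exist in Modin, and `ray_overrides_from_env` is a hypothetical helper. The idea is only that each value could be overridden separately and then merged over the defaults.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Hypothetical variable names chosen for this example; Modin does not define them.
ENV_TO_KWARG = {
    "EXAMPLE_RAY_MEMORY": "_memory",
    "EXAMPLE_RAY_OBJECT_STORE_MEMORY": "object_store_memory",
    "EXAMPLE_RAY_INCLUDE_DASHBOARD": "include_dashboard",
}


def ray_overrides_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect per-setting overrides; settings without a variable are left out."""
    env = os.environ if env is None else env
    overrides: Dict[str, Any] = {}
    for var, kwarg in ENV_TO_KWARG.items():
        raw = env.get(var)
        if raw is None:
            continue
        if kwarg == "include_dashboard":
            overrides[kwarg] = raw.strip().lower() in {"1", "true", "yes"}
        else:
            overrides[kwarg] = int(raw)  # byte count
    return overrides


# Stand-in defaults that mirror the coupled values described in this issue.
defaults = {
    "object_store_memory": 2 * 1024 ** 3,
    "_memory": 2 * 1024 ** 3,
    "include_dashboard": False,
}
fake_env = {
    "EXAMPLE_RAY_MEMORY": str(8 * 1024 ** 3),
    "EXAMPLE_RAY_INCLUDE_DASHBOARD": "true",
}
merged = {**defaults, **ray_overrides_from_env(fake_env)}
print(merged)
```

With this shape, raising `_memory` would no longer drag `object_store_memory` along with it, which is the coupling that caused the ValueError described above.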
