PowerPC issue and protobuf #506

Open
ellisja opened this issue May 10, 2022 · 2 comments
@ellisja

ellisja commented May 10, 2022

Hi!
I am trying to get NVFlare (v2.0.15) running on the Summit machine at Oak Ridge National Lab (IBM Power9 / V100), and I am running into an issue with server startup. The error comes from protobuf (v3.20.1). Have you heard of or experienced this before? I have not been able to find any information on Stack Overflow or GitHub.

bash-4.4$ jsrun -n1 ./poc/server/startup/start.sh
WORKSPACE set to /gpfs/alpine/stf018/proj-shared/jaellis2/privacy/examples/poc/server/startup/..
sub_start: WORKSPACE set to /gpfs/alpine/stf018/proj-shared/jaellis2/privacy/examples/poc/server/startup/..
sub_start: Prepare to train nvflare server
Traceback (most recent call last):
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/site-packages/nvflare/private/fed/app/server/server_train.py", line 27, in <module>
    from nvflare.private.fed.app.fl_conf import FLServerStarterConfiger
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/site-packages/nvflare/private/fed/app/fl_conf.py", line 24, in <module>
    from nvflare.private.fed.client.base_client_deployer import BaseClientDeployer
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/site-packages/nvflare/private/fed/client/base_client_deployer.py", line 18, in <module>
    from nvflare.private.fed.client.fed_client import FederatedClient
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/site-packages/nvflare/private/fed/client/fed_client.py", line 28, in <module>
    from ..utils.numproto import proto_to_bytes
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/site-packages/nvflare/private/fed/utils/numproto.py", line 21, in <module>
    from nvflare.private.fed.protos.federated_pb2 import NDArray
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/site-packages/nvflare/private/fed/protos/federated_pb2.py", line 19, in <module>
    from google.protobuf import descriptor as _descriptor
  File "/ccs/home/jaellis2/.conda/envs/smartsim/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 47, in <module>
    from google.protobuf.pyext import _message
TypeError: bases must be types
System is in trouble and unable to start the task!!!!!

I needed to follow these instructions to install NVFlare without tenseal: #130

Separately, I have NVFlare running on our x86 clusters just fine. I can give plenty more detail on my env if needed. Thanks for any help!
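
For reference, the environment details most relevant to this report (CPU architecture, Python version, package versions) can be collected with standard-library Python only. This is a minimal sketch: `environment_report` is a hypothetical helper name, and the default package list is an assumption about what is worth comparing between the Power9 and x86 installs.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("nvflare", "protobuf", "grpcio")):
    """Return a dict describing the interpreter, platform, and package versions.

    The package names are illustrative; pass whatever is relevant to the install.
    """
    report = {
        "machine": platform.machine(),          # e.g. ppc64le on Power9, x86_64 on Intel/AMD
        "system": platform.system(),
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it once in the Summit environment and once on an x86 cluster gives two reports that can be compared side by side.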

@IsaacYangSLA
Collaborator

The issue seems to come from protobuf. We would need a PowerPC environment to reproduce it.

@chesterxgchen
Collaborator

@ellisja can you check, as a separate test, whether the protobuf version you are using works on PowerPC at all and, if it doesn't, which version does work? Then we can help investigate. It's a bit difficult for us to investigate since we don't have a PowerPC machine.
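
One way to run that separate test is to import protobuf in a fresh interpreter, outside NVFlare. The sketch below is an assumption about how such a check could look, not something provided by the NVFlare maintainers. `probe_protobuf` is a hypothetical helper name. It relies on the `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable, which protobuf 3.x reads to choose between its C++ extension backend and its pure-Python backend. The traceback above fails while importing `google.protobuf.pyext._message` (the C++ extension), so comparing the default backend with `python` may show whether only the extension build is affected.

```python
import os
import subprocess
import sys

# Code run in a child interpreter so that a failed import cannot leave
# half-initialised modules behind in the current process.
PROBE = (
    "import google.protobuf as pb\n"
    "from google.protobuf.internal import api_implementation\n"
    "from google.protobuf import descriptor  # the import that fails in the traceback\n"
    "print(pb.__version__, api_implementation.Type())\n"
)


def probe_protobuf(implementation=None):
    """Try to import protobuf in a subprocess and report what happened.

    implementation: None to keep the current setting, or "python" / "cpp"
    to set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION for the child process.
    """
    env = dict(os.environ)
    if implementation is not None:
        env["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = implementation
    proc = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
    )
    stderr_lines = proc.stderr.strip().splitlines()
    return {
        "implementation": implementation or "default",
        "ok": proc.returncode == 0,
        "stdout": proc.stdout.strip(),                      # "<version> <backend>" on success
        "error": stderr_lines[-1] if stderr_lines else "",  # last traceback line on failure
    }


if __name__ == "__main__":
    for impl in (None, "python"):
        print(probe_protobuf(impl))
```

If the `python` run succeeds while the default run fails with the same `TypeError`, that would point at the compiled extension in this particular install rather than at NVFlare itself. Repeating the probe after installing a different protobuf version in the same environment would answer the "which version works" part of the question.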
