On Apr 8, 2025, at 3:59 AM, Sumanth Gopalagowda wrote:
Dear Professor Liao,
I hope this message finds you well.
My name is Sumanth Gopalagowda, ...
As part of my academic project, I am working on computing various deformation measures from snapshots of atomistic simulations.
After computing these measures, I am attempting to write the resulting data to disk in parallel using the PnetCDF-Python interface. However, I am facing performance issues during the I/O stage on our HPC cluster. For instance, writing approximately 72 GB of data takes around 269 seconds on a single node, but the time increases to about 337 seconds when using two nodes. This I/O bottleneck is negating the computational speed-up achieved through parallelization.
I have attached the relevant portion of my code that performs the file writing. I would be extremely grateful if you could take a moment to review it and provide any suggestions or insights on how I might improve its scalability and performance.
Thank you very much for your time and consideration.
Best regards,
Sumanth Gopalagowda
write_out.py.txt
Please let us know what file system you are using (is it a parallel file system?).
It would also help if you could add a main function to your short program so
that we can run it and reproduce the performance numbers.
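For reference, the timings quoted in the original message can be converted into effective write bandwidth. This is only a rough sketch: it assumes the 72 GB and the 269 s / 337 s figures are exact, and the helper name `write_bandwidth_gb_per_s` is made up for illustration and is not part of PnetCDF-Python.

```python
def write_bandwidth_gb_per_s(size_gb: float, seconds: float) -> float:
    """Effective write bandwidth in GB/s for a given data size and wall time."""
    return size_gb / seconds


# Figures taken from the original message (approximate values).
SIZE_GB = 72.0
ONE_NODE_SECONDS = 269.0
TWO_NODE_SECONDS = 337.0

one_node = write_bandwidth_gb_per_s(SIZE_GB, ONE_NODE_SECONDS)
two_nodes = write_bandwidth_gb_per_s(SIZE_GB, TWO_NODE_SECONDS)

# Ratio > 1 means the two-node run took longer than the one-node run.
slowdown = TWO_NODE_SECONDS / ONE_NODE_SECONDS

print(f"1 node : {one_node:.3f} GB/s")
print(f"2 nodes: {two_nodes:.3f} GB/s")
print(f"2-node run takes {slowdown:.2f}x the 1-node time")
```

Under these assumptions the write rate is roughly 0.27 GB/s on one node and 0.21 GB/s on two nodes, so the two-node run takes about 25% longer. These numbers alone do not identify the cause, which is why the file system type and a runnable reproducer are needed.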