Segmentation fault under filebench workloads #44

Open

leftgeek opened this issue Nov 23, 2020 · 5 comments

@leftgeek

Environment: SplitFS, Optane DC PM, Ubuntu 18.04 LTS, glibc 2.27, gcc 7.5.0

When I run the filebench workloads (varmail, fileserver, webserver, webproxy) using scripts/filebench/run_fs.sh, I always get a segmentation fault (core dumped).
Although varmail, fileserver, and webproxy still complete the tests and report results, those results are doubtful because the performance is significantly lower than that of ext4-DAX.
Webserver generally crashes immediately.
The problem does not seem to be related to the workload data size.

I set up SplitFS exactly following the documented steps; the only change I made was raising NVP_NUM_LOCKS from 32 to 144, because my machine has 72 logical CPUs.
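
For context, here is a minimal, hypothetical sketch (not the actual nvp_lock.h code; every name except NVP_NUM_LOCKS is made up) of how a compile-time lock count is commonly used to stripe locks across CPUs. It only illustrates why such a constant tends to be sized relative to the number of logical CPUs (here 144 = 2 × 72):

```c
/* Hypothetical sketch, NOT the actual nvp_lock.h code: it only illustrates
 * why a compile-time lock count such as NVP_NUM_LOCKS is usually tied to
 * the number of logical CPUs. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

#define NVP_NUM_LOCKS 144   /* assumed value after the edit described above */

static pthread_rwlock_t nvp_locks[NVP_NUM_LOCKS];

/* Each element must be initialized before use. */
static void nvp_locks_init(void)
{
    for (int i = 0; i < NVP_NUM_LOCKS; i++)
        pthread_rwlock_init(&nvp_locks[i], NULL);
}

/* Pick a lock stripe based on the CPU the calling thread is running on,
 * so threads on different CPUs rarely contend on the same lock. */
static pthread_rwlock_t *nvp_lock_for_cpu(void)
{
    int cpu = sched_getcpu();   /* GNU extension; returns -1 on error */
    if (cpu < 0)
        cpu = 0;
    return &nvp_locks[cpu % NVP_NUM_LOCKS];
}
```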

@rohankadekodi
Collaborator

Hello,

Thanks for trying out SplitFS. I am not sure what is causing the problem at your end. I have modified the filebench_run.sh script to ensure that the correct environment variables are set, and I have also modified the workload files to match those in the filebench repository (except for webserver and webproxy).

Could you pull and check again? Also, please make sure you do the following things:

  1. Run the filebench that is present in the SplitFS repository
  2. Do not modify the filebench workload files (you can modify them later, but please keep them unchanged for now for debugging purposes)
  3. Run filebench using ./run_filebench.sh present in the scripts directory
  4. Do not make any changes to the common.mk Makefile in the splitfs/ source directory

If it runs correctly, you should see around a 2-3x improvement for SplitFS over ext4-DAX in varmail and fileserver. I see the same on my end, having tested the performance on Fedora 30 with the same workload files on a machine with 96 logical CPUs.

@leftgeek
Author

Thanks for your reply.
The problem still exists, but I found something new:

  1. SplitFS does outperform ext4-DAX by 2-3x on Linux 4.13.
  2. The performance of ext4-DAX on Linux 5.1 is much better than on Linux 4.13 (especially with #threads >= 8).
  3. Filebench crashes immediately when running on NOVA and PMFS.

So the problem is probably related to glibc and the kernel.
Have you tried porting SplitFS to a newer Linux kernel (5.x)?

@leftgeek
Author

Sorry, it seems that the filebench core dump is caused by modifying NVP_NUM_LOCKS in nvp_lock.h.
Thus, SplitFS can currently only run on CPUs 0-15.
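
As a workaround on machines with more CPUs, one option is to restrict the benchmark run to CPUs 0-15. Below is a minimal C sketch (not part of SplitFS or filebench) that pins the calling process to that range via sched_setaffinity; launching the workload under an equivalent external affinity mask achieves the same effect:

```c
/* Minimal sketch (not from the SplitFS sources): restrict the calling
 * process, and any threads it spawns, to logical CPUs 0-15 so the run
 * stays inside the CPU range that currently works. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    for (int cpu = 0; cpu < 16; cpu++)   /* CPUs 0-15 only */
        CPU_SET(cpu, &mask);

    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {  /* 0 = this process */
        perror("sched_setaffinity");
        return EXIT_FAILURE;
    }

    printf("pinned to CPUs 0-15; launch the workload from here (e.g. via execvp)\n");
    return EXIT_SUCCESS;
}
```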

@rohankadekodi
Collaborator

Thank you for this insight. I will look into why the core dump comes up when running with more cores on 5.1. The problem does not seem to be fundamental to SplitFS. Just to confirm, is the performance of SplitFS better than ext4-DAX on Linux 5.1 when you run with cores 0-15?

I am not sure why filebench crashes immediately when running on NOVA and PMFS. We don't modify NOVA and PMFS, so it might be an issue with your kernel.

@leftgeek
Author

leftgeek commented Dec 1, 2020

I failed to port SplitFS to Linux 5.1 because the ext4 code has changed in Linux 5.1.
For varmail, webserver, and fileserver, ext4-DAX on 5.1 outperforms ext4-DAX on 4.13 by about 10-20% with threads >= 8.
