Huge GC pressure when downloading a file over http (but not https) #345
Comments
Can you control the buffer size? It seems this is just spinning the CPU. I guess if the buffer size is small this would be the symptom.
You cannot control the buffer size in cohttp. I'm in the process of fixing this problem, but for the server side. The client side will follow afterwards. I'll have a look at whether a quick fix is possible until then, but I wouldn't count on it.
Thx. Out of curiosity, what is/was the root cause?
Sure. Your comment actually made me take a look now and I think this is actually fixable relatively easily now! If we follow along the code path for fetching a response we get something like:
Connecting 2, 3, and 4 together, we realize that we are trying to allocate a rather hefty string! (It is thrown away immediately after being read instead of being reused, but that's a separate problem.) If you're interested, a fix for this would be appreciated. If not, then I'll try to get it done next week. @artemkin already fixed a similar problem with async in issue #330, so you can have a look there.
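To illustrate the "reused instead of thrown away" point above, here is a minimal sketch that uses only the OCaml standard library. It is not cohttp's code and not the actual fix; the function name copy_with_reused_buffer and the 16 KiB default are assumptions made up for this example. It allocates one fixed-size buffer and reuses it for every read, so per-read allocation does not grow with the amount of data transferred.

```ocaml
(* Hypothetical helper, not part of cohttp: copy everything from [ic] to
   [oc] through a single buffer that is allocated once and reused for
   every read. Returns the number of bytes copied. *)
let copy_with_reused_buffer ?(buf_size = 16 * 1024) ic oc =
  let buf = Bytes.create buf_size in
  let rec loop total =
    match input ic buf 0 buf_size with
    | 0 -> total (* [input] returns 0 only at end of input *)
    | n ->
      output oc buf 0 n;
      loop (total + n)
  in
  loop 0
```

With channels opened via open_in_bin and open_out_bin, calling copy_with_reused_buffer ic oc copies the whole input while only ever holding buf_size bytes of buffer.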
@marklrh if you want, you can try tackling this as well.
@rgrinberg, it appears the fix resolves my issue completely. Testing from localhost, I was able to get 400+ megabytes per second even with standard GC settings, which I think is completely acceptable :). Even if at that point it was CPU bound. (But netcat wasn't far off at 65%.) Thanks!
@eras NP. This is a pretty significant issue, so I will make a release for it as well (0.17.2).
I considered piggy-backing on issue #207, but as it was decidedly about the server side, I decided to open a new one.
Downloading a file over HTTP becomes a CPU-bound operation on a three-year-old PC, limited to 800 kilobytes/second after some GC tuning (the same internet resource can be loaded at 1.4M/s with curl). I have the source code reproducing the issue here: https://www.modeemi.fi/~flux/software/ocaml/downloader/ (this is perhaps easier to test than server-side operation).
The application is able to download a separate HTTPS internet resource at 4.3M/s while consuming 17% CPU. Before GC tuning, HTTPS had a similar issue as well. If I increase Gc.minor_heap_size to (512 * (1 lsl 20)), the download does get faster and the CPU% improves, but after a few seconds the CPU% gets back to 100%, the virtual size of the process is now 12 gigabytes, and the resident size creeps up towards at least 1 gigabyte during the transfer. (I tried a few Gc values in between without better results; in particular, the Downloader uses the value 32 lsl 20.)
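For readers who want to try this kind of tuning, here is a small sketch of setting the minor heap size through the standard Gc module. The helper name set_minor_heap_words is made up for this example, and this is not the Downloader's actual code. Note that minor_heap_size is measured in words, not bytes, so 32 lsl 20 is 256 MiB on a 64-bit machine.

```ocaml
(* Hypothetical helper: set the minor heap size, given in words (not
   bytes), and return the previous setting so it can be restored. *)
let set_minor_heap_words words =
  let old = Gc.get () in
  Gc.set { old with Gc.minor_heap_size = words };
  old.Gc.minor_heap_size

let () =
  (* The issue mentions 32 lsl 20; a smaller value (1 lsl 20 words, which
     is 8 MiB on 64-bit) is assumed here to keep the demo light. *)
  let previous = set_minor_heap_words (1 lsl 20) in
  Printf.printf "minor heap: %d -> %d words\n"
    previous (Gc.get ()).Gc.minor_heap_size
```

As the thread shows, a larger minor heap only masked the symptom here; the root cause was the oversized per-read allocation.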
Here's what perf top says:
In this case I would eventually want to save downloaded content to a file.