Memory leak/crash with v3/master #1411

Closed
mhalden opened this issue May 10, 2017 · 17 comments

@mhalden

mhalden commented May 10, 2017

We are observing a memory leak with ModSecurity v3 (commits 0e05b7b and 6421ff0) on FreeBSD 10.3 with nginx 1.12.0. After running for a few minutes, the process allocates about 32 GiB of memory (it seems to always allocate the same amount), shortly afterwards uses up all the memory, and then dies with SIGSEGV. From what we can gather, it always dies while evaluating the same rule, although that rule has been evaluated successfully earlier.

We have the following stack trace from the coredump.

(gdb) bt
#0  0x0000000801db4685 in memcpy () from /lib/libc.so.7
#1  0x0000000802d5c232 in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string () from /usr/lib/libc++.so.1
#2  0x0000000800e20e8c in modsecurity::Rule::getFinalVars (this=0x804740380, trans=<value optimized out>) at rule.cc:503
#3  0x0000000800e22818 in modsecurity::Rule::evaluate (this=0x804740380, trans=0x8063db000, ruleMessage={__ptr_ = 0x8040aad00, __cntrl_ = 0x80b7fa560}) at rule.cc:637
#4  0x0000000800e15e8f in modsecurity::Rules::evaluate (this=0x806fe6400, phase=<value optimized out>, transaction=0x8063db000) at rules.cc:212
#5  0x0000000800e07afc in modsecurity::Transaction::processRequestBody (this=<value optimized out>) at transaction.cc:799
#6  0x0000000800e0d3e9 in msc_process_request_body (transaction=0x81b9bde08) at transaction.cc:1792
#7  0x000000000049b72f in ngx_http_modsecurity_pre_access_handler ()
#8  0x0000000000449010 in ngx_http_core_generic_phase ()
#9  0x0000000000448f9d in ngx_http_handler ()
#10 0x0000000000452e99 in ngx_http_process_request ()
#11 0x000000000045445f in ngx_http_free_request ()
#12 0x000000000043d0c5 in ngx_freebsd_sendfile_chain ()
#13 0x0000000000433902 in ngx_process_events_and_timers ()
#14 0x000000000043bfcd in ngx_single_process_cycle ()
#15 0x0000000000439c9e in ngx_spawn_process ()
#16 0x000000000043af89 in ngx_master_process_cycle ()
#17 0x000000000043a703 in ngx_master_process_cycle ()
#18 0x0000000000413fd8 in main ()
@ltning

ltning commented May 10, 2017

Some additional info:

The debug log (which also tells us the rule being evaluated) reads:
[4] (Rule: 949100) Executing operator "Eq" with param "1" against IP.
[6] Resolving: ip.reput_block_reason to: ModSecurity Core Rule Set is deployed without configuration! Please copy the crs-setup.conf.example template to crs-setup.conf, and include the crs-setup.conf file in your webserver configuration before including the CRS rules. See the INSTALL file in the CRS directory for detailed instructions.'

Clearly, there are (possibly several) things going wrong here: This msg does not pertain to that particular rule at all. We have obviously deployed with configuration, otherwise this message would have appeared for every request. As @mhalden mentioned, the same rule is hit every time the situation occurs. Other rules (including (false) positives) hit all the time, so in general, rule processing seems to work.

We're concatenating modsecurity.conf, crs-setup.conf and rules/*.conf (minus PHP, IIS, WORDPRESS and DRUPAL) into our local config, which is used globally in the nginx configuration.

I'll also refer to issue #1406 (empty audit logs) - this is on the same build and the same systems.

@ltning

ltning commented May 10, 2017

Another tidbit: We suspect that the same issue occurred in 2.9.1. Our first attempt at using ModSecurity was nginx with 2.9.1, as that is what the FreeBSD port of nginx originally supported. Soon after building and installing it, we saw nginx bombing out as described above (working fine for a little while, then, upon certain requests that we have not identified, growing wildly).

We did not dig further into this as we realised v3 is the future, and embarked upon making an updated nginx port and a standalone modsecurity port to support this.

@ltning

ltning commented May 10, 2017

From ktrace output (time stamps are relative to the previous entry):

50170 nginx    0.000008 RET   write 50/0x32
 50170 nginx    0.000014 CALL  write(0x6,0x80bbc6000,0x46)
 50170 nginx    0.000017 GIO   fd 6 wrote 70 bytes
       "[4] (Rule: 949100) Executing operator "Eq" with param "1" against IP.
       "
 50170 nginx    0.000008 RET   write 70/0x46
 50170 nginx    0.000048 CALL  write(0x6,0x80bbc6000,0x153)
 50170 nginx    0.000018 GIO   fd 6 wrote 339 bytes
       "[6] Resolving: ip.reput_block_reason to: ModSecurity Core Rule Set is deployed without configuration! Please copy the crs-setup.conf.example template to crs-setup.conf, and include the crs-setup.conf file in your webserver configuration before including the CRS rules. See the INSTALL file in the CRS directory\
         for detailed instructions.'
       "
 50170 nginx    0.000008 RET   write 339/0x153
 50170 nginx    0.000063 CALL  mmap(0,0x804400000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
 50170 nginx    0.000016 RET   mmap 34596716544/0x80e200000
 50170 nginx    0.000008 CALL  munmap(0x80e200000,0x804400000)
 50170 nginx    0.000013 RET   munmap 0
 50170 nginx    0.000008 CALL  mmap(0,0x8047ff000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
 50170 nginx    0.000009 RET   mmap 34596716544/0x80e200000
 50170 nginx    0.000008 CALL  munmap(0x80e200000,0x200000)
 50170 nginx    0.000010 RET   munmap 0
 50170 nginx    0.000008 CALL  munmap(0x1012800000,0x1ff000)
 50170 nginx    0.000009 RET   munmap 0
 50170 nginx    29.564780 PSIG  SIGSEGV SIG_DFL code=SEGV_MAPERR
 50170 nginx    0.000006 NAMI  "/tmp/cores/nginx.core"
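
As a side note on the trace above, the following is a minimal Python sketch that converts the two mmap length arguments into GiB and compares them with the pointers shown in the gdb backtrace. All values are copied from the output in this thread; the helper names (`to_gib`, `shares_upper_bits`) are made up for illustration. The first length works out to roughly 32 GiB, which matches the reported allocation, and both lengths fall in the same numeric range as the heap pointers. That would be consistent with a length field being read from invalid memory, but it is only a guess from the numbers, not a diagnosis.

```python
# Values copied from the ktrace output and the gdb backtrace in this thread.
MMAP_SIZES = [0x804400000, 0x8047FF000]                   # length arguments of the two mmap calls
HEAP_POINTERS = [0x804740380, 0x8063DB000, 0x806FE6400]   # this/trans pointers from gdb

GIB = 1 << 30


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


def shares_upper_bits(value: int, pointers: list[int]) -> bool:
    """True if `value` has the same bits above bit 31 as every observed pointer.

    This is a crude heuristic (an assumption of this sketch) for
    "this number looks like an address from the same region".
    """
    return all(value >> 32 == p >> 32 for p in pointers)


for size in MMAP_SIZES:
    print(f"{size:#x} = {size} bytes = {to_gib(size):.2f} GiB, "
          f"address-like: {shares_upper_bits(size, HEAP_POINTERS)}")
```
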

@zimmerle
Contributor

Hi @ltning,

Thank you for your detailed report. I've just made a commit [6143eb9] that removes LMDB from the default configuration options. Compiling the code without LMDB support should be safe.

@zimmerle
Contributor

Please let me know the results while running without LMDB support.

@zimmerle zimmerle self-assigned this May 10, 2017
@ltning

ltning commented May 11, 2017

There haven't been any more core dumps since we updated, I believe. The "typical" traffic should start hitting right about now, so we'll know for certain in a few hours - but statistically it should have dumped at least 30 times during the night alone. Thanks!

Now if we'd just have those pesky audit logs written to...

@mhalden
Author

mhalden commented May 11, 2017

So far LMDB support seems to be irrelevant for the crash as it dies in the same way as before even without LMDB. But when building with CFLAGS and CXXFLAGS set to -g -O0 we have so far been unable to reproduce the issue, both with 6143eb9 and 0e05b7b. ModSecurity is built with clang 3.4.1.

@ltning

ltning commented May 11, 2017

So - it looks like clang 3.4.1 causes nginx to crash and burn with -O1 or higher. GCC 6.3 and clang 3.8.1 both produce binaries that seem to work fine even with the default -O3.

Since FreeBSD 11 and above ships with clang 3.8, and 10.3 can use clang 3.8 from ports, the solution is to have the Makefile for the port pull in clang 3.8 if the OS version is <11.

So, again we're back to the missing audit logs before ModSecurity becomes useful :)

@ltning

ltning commented May 11, 2017

Aaaaand...scratch the part about clang. Even when built with 3.8.1 it crashes. We're now back to the GCC build, which we'll test more extensively.

@Sp1l

Sp1l commented May 3, 2018

Any updates on this? Improvements with clang 4 (FreeBSD 11.1)? Fixes in the ModSecurity code?

If this is not fixed, can this be re-opened or a new Issue created?

@ltning

ltning commented May 3, 2018

The current ports for nginx and mod_security3 work, but there are significant memory leaks on systems with heavy traffic. We're using cron to call 'nginx -s reload' every few hours to work around this, but we have not been able to determine the cause of the leaks yet. We are using the CRS 3.0 ruleset with some whitelisting.
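
For readers who want to try the same workaround, here is a minimal Python sketch of a reload helper that a cron job could call. It is not the script used in this thread: the function name `reload_nginx` and the extra `nginx -t` configuration check before reloading are assumptions added for illustration, and the `run` parameter exists only so the helper can be exercised without a real nginx binary.

```python
import subprocess


def reload_nginx(run=subprocess.run, nginx="nginx"):
    """Check the nginx configuration, then ask the master process to reload.

    `run` and `nginx` are parameters of this sketch (assumptions), so that a
    different runner or binary path can be substituted.
    Returns True only if both commands exit with status 0.
    """
    # `nginx -t` tests the configuration; skip the reload if it fails.
    check = run([nginx, "-t"], capture_output=True)
    if check.returncode != 0:
        return False
    # `nginx -s reload` signals the master process to start fresh workers.
    result = run([nginx, "-s", "reload"], capture_output=True)
    return result.returncode == 0
```

A small wrapper script that calls `reload_nginx()` and exits non-zero on failure could then be scheduled from cron every few hours, as described above.
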

@zimmerle
Contributor

zimmerle commented May 3, 2018

Right, so since there is no longer a crash and the leak is apparently not the same one, it is better to create a new issue, just to keep this issue easy to read. @ltning, can you create a new issue summarizing the leak that you are facing? Make sure to mention the libModSecurity version. Also, make sure that LMDB is not enabled. Thank you!

@Neko-Chang-Taiwan

Hi all,

Could someone help confirm whether this issue still exists?

Some history:
I got a segmentation fault in Apache 2.4 on FreeBSD, as described here:
owasp-modsecurity/ModSecurity-apache#59
On FreeBSD, the build used GCC ("USE_GCC= yes"), and the GDB output showed the error coming from GCC.

So I removed that setting and rebuilt with FreeBSD's built-in LLVM/Clang 10.
Apache 2.4 then started successfully and worked.
At that time I was running a minimal configuration.

I believe GCC is responsible for this issue.

Thanks a lot.

@zimmerle
Contributor

@Neko-Chang-Taiwan the Apache connector for ModSecurity v3 is still not ready.

@Neko-Chang-Taiwan

Neko-Chang-Taiwan commented Feb 11, 2021

> @Neko-Chang-Taiwan the Apache connector for ModSecurity v3 is still not ready.

Hi @zimmerle

I ask because the log shows the following:
[Wed Feb 10 19:36:53.893064 2021] [:notice] [pid 15168:tid 34370637824] ModSecurity: ModSecurity-Apache v0.1.1-beta configured.
The segmentation fault has also disappeared with modsecurity3.
So I think it is workable 😅

@zimmerle
Contributor

It is not ready for production yet. For Apache, we suggest that our users keep using the stable version: 2.9.3

@Neko-Chang-Taiwan

> It is not ready for production yet. For Apache, we suggest that our users keep using the stable version: 2.9.3

Hi @zimmerle

Thanks for your reply.
When it is ready for production, please check this issue against the latest version of LLVM/Clang if possible. 😄
