This is a write-up about detecting exploitation of the Log4Shell vulnerability (CVE-2021-44228) in Log4j by monitoring specific syscalls using Falco. This post also describes the analysis I employed to arrive at my conclusions.
Note that this is not meant to be an end-all detection for Log4Shell but instead one of many that, as a whole, provide coverage across different points of visibility.
Use of Weapons
Before we begin, a quick description of the main tools I used here:
- Falco is a tool that “can detect and alert on any behavior that involves making system calls”. I think it’s one of the best, if not the best, open source threat detection tools one can have in Linux-land.
- Sysdig is a syscall capture tool and shares DNA with Falco. The filters and outputs written in Sysdig are fully compatible with Falco rules. (How convenient!)
High quality detection is possible (YMMV) by watching the Java process’ write I/O buffer on network sockets, looking for patterns that indicate outbound connections via JNDI.
This has several advantages:
- The detection is triggered on an established connection, so we can infer that the app is vulnerable, whether or not the subsequent RCE was successful.
- By detecting the initial connection, we are not dependent on the various ways an attack may be carried out afterwards.
- Assuming an organization’s reliance on Java JNDI (LDAP and/or RMI) is very minimal, there is high confidence that this detection will have a high true positive rate.
- macro: java_network_write
  condition: (evt.type in (write, sendto) and evt.dir=< and fd.type in (ipv4, ipv6) and proc.name=java)

- macro: jndi_ldap_indicator
  condition: (evt.buffer contains "2.16.840")

- macro: jndi_rmi_indicator
  condition: (evt.buffer startswith "JRMI")

- rule: Java Process JNDI Connection
  desc: Potential exploitation of the log4shell Log4j vulnerability (CVE-2021-44228)
  condition: java_network_write and (jndi_ldap_indicator or jndi_rmi_indicator)
  output: Java process JNDI connection (user=%user.name user_loginname=%user.loginname user_loginuid=%user.loginuid event=%evt.type connection=%fd.name server_ip=%fd.sip server_port=%fd.sport proto=%fd.l4proto process=%proc.name command=%proc.cmdline parent=%proc.pname buffer=%evt.buffer container_id=%container.id image=%container.image.repository)
This rule requires running Falco with the -A flag, which turns on all syscall monitoring. To run Falco normally, remove write from the java_network_write macro:
- macro: java_network_write
  condition: (evt.type=sendto and evt.dir=< and fd.type in (ipv4, ipv6) and proc.name=java)
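To make the rule's logic easier to reason about (and to test patterns against captured buffers offline), here is a minimal Python sketch that mirrors the macros above. The Event class and the function names are hypothetical stand-ins for illustration only; they are not part of Falco, and Falco does not evaluate rules this way internally.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for the few Falco fields the rule reads."""
    evt_type: str    # evt.type
    direction: str   # evt.dir ("<" is the exit side of the syscall)
    fd_type: str     # fd.type
    proc_name: str   # proc.name
    buffer: bytes    # evt.buffer, already truncated to the capture size


def java_network_write(e: Event) -> bool:
    # Mirrors the java_network_write macro.
    return (
        e.evt_type in ("write", "sendto")
        and e.direction == "<"
        and e.fd_type in ("ipv4", "ipv6")
        and e.proc_name == "java"
    )


def jndi_ldap_indicator(e: Event) -> bool:
    # Mirrors: evt.buffer contains "2.16.840"
    return b"2.16.840" in e.buffer


def jndi_rmi_indicator(e: Event) -> bool:
    # Mirrors: evt.buffer startswith "JRMI"
    return e.buffer.startswith(b"JRMI")


def rule_matches(e: Event) -> bool:
    # Mirrors the rule condition.
    return java_network_write(e) and (
        jndi_ldap_indicator(e) or jndi_rmi_indicator(e)
    )
```

Note that the RMI check is anchored to the start of the buffer while the LDAP check is not, so the LDAP pattern is the one affected by how much of the buffer is captured.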
Quick note about byte value matching
An upcoming version of Falco (possibly 0.32.0) adds support for matching byte values expressed as hex strings (see PR). With this feature we can write more robust rules by matching on the first few bytes, instead of relying solely on ASCII patterns.
- This detection only covers LDAP and RMI connections.
- Java 17 uses the write() syscall, but Falco ignores it by default. There is an option that enables monitoring of all syscalls (the -A flag), but it could impact performance.
- Because Falco captures only the first 80 bytes of the I/O buffer, this detection can be bypassed if the path of the callback URL exceeds 27 bytes. See later section for details and workarounds.
- Virtualbox VM with Ubuntu 18.04.6 LTS
- Sysdig 0.27.1
- Falco 0.30.0
- OpenJDK Runtime Environment (build 17.0.1+12-Ubuntu-118.04)
- OpenJDK Runtime Environment (build 11.0.13+8-Ubuntu-0ubuntu1.18.04)
- OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~18.04-b07)
- Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
- Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Vulnerable Log4j versions
- All versions from 2.0 up to 2.14.1
- PoC provided by d0nutptr, using marshalsec + python http server
- log4Jtest by AndrewMohawk
All tested exploits operate as illustrated below (credits to Fastly):
The vulnerable apps ran exclusively as a separate user (“poc”) to make it easier to trace processes.
I started with the vulnerable jar file and exploit server from d0nutptr’s PoC (log4j v2.14.1 running on Java 17). The exploit server is run as follows:
Note that the jndi:ldap server is listening on port 12345, and the HTTP server (with the malicious Java class) is on port 12346. It is also good to note the code for the malicious Java class ("foo"):
The vulnerable app (jar file) is then run from a script:
The string “Pwned 8)” is printed which indicates that the exploit was successful.
While this was happening, Sysdig was running on another terminal window. To make sure I didn’t miss anything, I first cast a wide net, i.e., I ran it such that it logged most syscalls from the vulnerable app:
Looking at the resultant logs, four events stood out:
Event 1: The initial connection to the callback URL jndi:ldap://localhost:12345 (Phase 1 in the diagram), and the first few bytes that were sent by the app:
The strings foo (the path in the callback URL) and 2.16.840.1.113730.3.4.2 look interesting... but we'll get back to that later.
Event 2: The subsequent HTTP call to the Python web server that was hosting the malicious class (Phase 2 in the diagram):
This is showing the actual HTTP GET request made by the app to the web server.
Event 3: The payload execution which wrote “Pwned 8)” to STDOUT:
Event 4: A final call was made to the original jndi:ldap callback URL (port 12345), which also includes the 2.16.840.1.113730.3.4.2 string seen during the initial connection. This was immediately followed by connection termination:
It is also important to note that this event only fires after the malicious class has been loaded and executed. If the class establishes a reverse shell, for example, then this event will not show up until after the reverse shell session has terminated.
This compares detection potential for the notable events from the previous section.
Given this, it made the most sense to obtain signals from Event 1, with Event 4 serving as fallback.
The most likely candidate for a pattern match is the 2.16.840.1.113730.3.4.2 string (full or partial). To further support this, it turns out that this number is the LDAP OID for the ManageDsaIT Request Control.
Based on the prior findings, I updated the Sysdig command so that it captures only the relevant information:
sysdig -X "user.name=poc and evt.type in (write, sendto) \
and evt.dir=< and \
fd.type in (ipv4, ipv6) and proc.name=java" \
-p"%fd.name (%evt.type, %evt.buflen bytes) %evt.buffer"
Now is a good time to explain the syntax:
For a complete list of fields, see this doc from falco.org.
Notice that the sendto() event was included here, too. This is explained in the next section.
Java 17 vs older versions
During the course of my testing, I learned that different Java versions use different syscalls when making JNDI connections. While Java 17 (the latest, default version in Ubuntu 18.04) makes write() calls, Java 6, 7, 8 and 11 all use sendto().
Java 17 looks like this:
while on Java 6, 7, 8 and 11:
The data is identical except for the syscall itself.
Effect of callback URL path length
The data we’ve seen so far was from using the callback URL localhost:12345/foo, and we can see the path in the Event 1 signal:
By default, Sysdig (and Falco) captures only the first 80 bytes of the I/O buffer. If the path is longer, we can see the subsequent bytes of data getting shifted back such that some end up beyond the capture boundary. For example, if the path is /foo456789/123456/8901234567, we can't see the full 2.16.840.1.113730.3.4.2 pattern anymore:
In the case of the base64-encoded URLs such as described in this article, a path like /Basic/Command/Base64/d2dldCBoeHhwOi8vMTI3LjAuMC4xL2xoLnNoO2NobW9kICt4IGxoLnNoOy4vbGguc2gK would be long enough to completely fill up the buffer:
At this point we would have lost whatever pattern we were matching for, and the detection is effectively bypassed.
The only way to work around this is to start Sysdig and Falco with a larger I/O buffer capture size. Here’s the data when the size is 240 bytes (3x the default):
We get our signal back!
Increasing the buffer size could potentially impact the performance of systems. If we need to stick to defaults, there is fortunately a fallback even if long paths bypass the Event 1 signal: the Event 4 signal.
As far as I can tell, this value is consistent and is independent of any property of the callback URL. However, it does come with its own caveats as described in the Event 4 section.
Also note that the upcoming byte matching feature of Falco could potentially eliminate this issue.
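The effect of path length can be reproduced with a little arithmetic. The following Python sketch builds an LDAPv3 SearchRequest and checks whether the pattern falls within the capture window. The message layout is an assumption on my part (base-object scope, a presence filter on objectClass, no requested attributes, and a single ManageDsaIT control, with the path sent as the base DN without its leading slash); it is not copied from the captures. The function names are hypothetical. Under that assumption, the result agrees with the 27-byte limit noted earlier.

```python
OID = b"2.16.840.1.113730.3.4.2"  # ManageDsaIT Request Control
SNAPLEN = 80                      # default Sysdig/Falco capture size


def ber_len(n: int) -> bytes:
    """BER definite length: short form below 128, minimal long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def tlv(tag: int, value: bytes) -> bytes:
    """Encode one BER tag-length-value element."""
    return bytes([tag]) + ber_len(len(value)) + value


def search_request(base_dn: bytes, message_id: int = 2) -> bytes:
    """Assumed shape of the search request sent for a JNDI LDAP lookup."""
    body = (
        tlv(0x04, base_dn)            # baseObject (path from the URL)
        + tlv(0x0A, b"\x00")          # scope: baseObject
        + tlv(0x0A, b"\x03")          # derefAliases (value assumed)
        + tlv(0x02, b"\x00")          # sizeLimit
        + tlv(0x02, b"\x00")          # timeLimit
        + tlv(0x01, b"\x00")          # typesOnly: false
        + tlv(0x87, b"objectClass")   # filter: (objectClass=*)
        + tlv(0x30, b"")              # attributes: none listed
    )
    controls = tlv(0xA0, tlv(0x30, tlv(0x04, OID)))
    return tlv(0x30, tlv(0x02, bytes([message_id])) + tlv(0x63, body) + controls)


def pattern_visible(dn: str, needle: bytes = b"2.16.840",
                    snaplen: int = SNAPLEN) -> bool:
    """True if the needle lies fully inside the first snaplen bytes."""
    return needle in search_request(dn.encode())[:snaplen]
```

With this layout the OID starts 45 bytes after the start of the path, so the 8-byte "2.16.840" prefix fits inside an 80-byte capture only while the path is 27 bytes or shorter; pattern_visible("foo456789/123456/8901234567") is True, and adding one more character makes it False. The long base64 path from above is only matched again once snaplen is raised, for example to 240.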
What about RMI?
Up to this point I have only covered jndi:ldap connections. I did the same analysis for jndi:rmi, and it turns out that the matched pattern is straightforward and consistent:
So, for RMI, I think it should be enough to simply watch for the pattern “JRMI” at the start of the buffer.
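For context, "JRMI" is the magic number that opens a Java RMI (JRMP) stream; per the RMI wire protocol it is followed by a two-byte version and a one-byte protocol identifier. The sketch below shows how a captured buffer could be checked slightly more strictly than with the ASCII pattern alone. The function name and the returned tuple are hypothetical, chosen only for this illustration.

```python
import struct

JRMP_MAGIC = 0x4A524D49  # the ASCII bytes "JRMI"

# Protocol identifiers defined by the RMI wire protocol.
PROTOCOLS = {
    0x4B: "StreamProtocol",
    0x4C: "SingleOpProtocol",
    0x4D: "MultiplexProtocol",
}


def parse_jrmp_header(buf: bytes):
    """Return (version, protocol name) if buf starts with a JRMP header, else None."""
    if len(buf) < 7:
        return None
    # Big-endian: 4-byte magic, 2-byte version, 1-byte protocol.
    magic, version, protocol = struct.unpack(">IHB", buf[:7])
    if magic != JRMP_MAGIC:
        return None
    return version, PROTOCOLS.get(protocol, "unknown")
```

Because the header sits at the very start of the stream, this check is not affected by the capture size in the way the LDAP pattern is.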
Testing all affected Log4j versions
The last thing I wanted to do was to verify that the signals are consistent across different vulnerable versions of Log4j and Java. Thanks to the log4jtest PoC I was able to write a quick and dirty script that can iterate through all combinations of Java and Log4j versions.
I installed Java 8, 11 and 17 as OpenJDK from the Ubuntu repository, while Java 6 and 7 were downloaded directly from Oracle. All Log4j libraries were downloaded from Apache. I then compiled different versions of the app, one for each of the Java versions.
Test results below.
expected: the expected pattern was observed
incompatible: Log4j version is not compatible with the Java version
backport: Log4j version is a backport update for its major version (i.e. not vulnerable)
The results show that the pattern is consistent across the board. At this point I was confident enough to develop the appropriate Falco rule.
- LunaSec Report
- RCE in log4j, Log4Shell, or how things can get bad quickly
- PSA: Log4Shell and the current state of JNDI injection
- New data and insights into Log4Shell attacks
- Falco (and Sysdig) Supported Fields
- Sysdig Article on Log4Shell
- LDAP OID Reference Guide / 2.16.840.1.113730.3.4.2
Big thanks to Andrew MacPherson and Nathanial Lattimer for providing me with PoCs and guidance. Also to Ian Carroll for feedback and ideas.