Automated Network Traffic Based Protocol Reverse Engineering, 10-R6376

Principal Investigators
Szu-Li Lin
Diego Alducin
Inclusive Dates 
07/01/23 to 07/01/24

Background

Industrial control systems, automotive controller area networks, and satellite communications may rely on “security through obscurity” to protect their systems. “Security through obscurity” is the idea that security can rest on the secrecy and complexity of a design, which is insufficient in a modern world where both design details and device communications often leak despite security measures. To properly secure a system, it is essential to observe its network communications and understand the rules and patterns seen in everyday operations. This research takes network traffic between devices and applies data analytics techniques to autonomously reverse engineer the packets with little or no documentation of how the protocol works. Such a strategy challenges the security through obscurity approach and encourages system owners to explore more complete cybersecurity solutions.

Approach

To autonomously reverse engineer the network communication, the team first researched the limits of protocol reverse engineering based on observable features in the network traffic. The team then explored how to provide a partial understanding when those limits prevent a complete understanding of the message data. The research considers industrial control system, automotive, and satellite communication networks. For each network type, the approach is to reverse engineer the unencrypted data through observation of the network traffic. For each set of network traffic, the team aligns the data payload bytes using an algorithm derived from bioinformatics to score and group similar messages. Following the alignment, the team uses metadata associated with the collection (e.g., that the windshield wipers were turned on) to autonomously discover messages that differ from a baseline collection, then analyzes the differences in alignment to infer the meaning of specific bytes in the data payload.
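The alignment and baseline comparison steps can be illustrated with a minimal sketch. The summary does not name the specific bioinformatics algorithm, so this example assumes Needleman-Wunsch global alignment, a common sequence alignment method, applied to payload bytes. The scoring constants and the names align and changed_positions are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Illustrative scoring values (assumptions, not taken from the project).
MATCH, MISMATCH, GAP = 1, -1, -2

Aligned = List[Optional[int]]  # one entry per aligned column; None marks a gap


def _sub(x: int, y: int) -> int:
    """Score for placing byte x opposite byte y."""
    return MATCH if x == y else MISMATCH


def align(a: bytes, b: bytes) -> Tuple[int, Aligned, Aligned]:
    """Globally align two payloads; return the score and both aligned rows."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * GAP
    for j in range(1, cols):
        score[0][j] = j * GAP
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + _sub(a[i - 1], b[j - 1]),  # pair the bytes
                score[i - 1][j] + GAP,                           # gap in b
                score[i][j - 1] + GAP,                           # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a: Aligned = []
    out_b: Aligned = []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + _sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], out_a[::-1], out_b[::-1]


def changed_positions(baseline: bytes, event: bytes) -> List[int]:
    """Aligned column indexes where the event payload differs from the baseline."""
    _, row_base, row_event = align(baseline, event)
    return [k for k, (x, y) in enumerate(zip(row_base, row_event)) if x != y]
```

For a made-up baseline payload 01 02 00 00 04 and a payload 01 02 00 01 04 captured while an action was performed, changed_positions returns [3], which would flag the fourth byte as a candidate for carrying that action's state. The alignment score can also serve as a similarity measure when grouping messages.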

Accomplishments

Figure 1: Validating database controller area network files.

This research showed that data alignment can automate the discovery of fields in the Modbus protocol structure without prior knowledge of that structure. It does this by grouping messages with similar data length and then aligning the bytes in those messages to infer the field boundaries. Furthermore, this research has proven successful for automotive networks, where the team automatically parsed several data payloads. One application is validating database controller area network files, as illustrated in Figure 1. One limitation discovered is that many data points in a vehicle change only slightly during a collection, so the inferred data payload field boundaries can be slightly off. For example, a car can go over 100 miles per hour, which may require seven (7) or more bits, but the reverse engineering process may incorrectly interpret the speed field as six (6) bits because of the limited range of speeds observed during testing. This implies that the data alignment reverse engineering scheme is best used when a general understanding is needed and being slightly off is not catastrophic. This research is ongoing, and the team plans to apply the reverse engineering strategy to satellite communications.
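The length grouping step and the bit-width limitation described above can be sketched as follows. This is a simplified illustration, not the team's implementation: it assumes payloads of equal length are already aligned byte for byte, treats any byte offset whose value varies as a candidate data field, and assumes an unsigned speed field at whole-number resolution. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_length(payloads: List[bytes]) -> Dict[int, List[bytes]]:
    """Group payloads so that only messages of equal length are compared."""
    groups: Dict[int, List[bytes]] = defaultdict(list)
    for payload in payloads:
        groups[len(payload)].append(payload)
    return dict(groups)


def varying_positions(group: List[bytes]) -> List[int]:
    """Byte offsets that take more than one value across same-length payloads."""
    return [i for i, column in enumerate(zip(*group)) if len(set(column)) > 1]


def observed_bit_width(values: List[int]) -> int:
    """Bits needed to hold the largest non-negative value seen in a capture."""
    return max(values).bit_length()


# Made-up capture: two 3-byte messages and one 2-byte message.
capture = [bytes.fromhex("01030a"), bytes.fromhex("01031e"), bytes.fromhex("0103")]
groups = group_by_length(capture)
print(varying_positions(groups[3]))           # offsets that look like data

# Speeds seen in a short test drive versus a wider range of driving.
print(observed_bit_width([12, 35, 60]))       # width inferred from limited data
print(observed_bit_width([12, 35, 60, 100]))  # width once 100 mph is observed
```

With these made-up values, a capture that tops out at 60 mph suggests a 6-bit field, while a capture that reaches 100 mph needs 7 bits. Any inference based only on observed values is bounded by the range exercised during collection.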