Binja Swift Name Demangling Plugin
I’ve recently started examining Swift-compiled binaries and quickly encountered the use of name mangling. This prompted me to explore how one can demangle these names to simplify reversing the binaries. I ended up creating a Binary Ninja plugin that uses the built-in ‘swift-demangle’ tool to rename symbols to their demangled version. In this post, we’ll cover Swift name mangling (with examples of how to manually demangle it), how the plugin leverages the built-in tool to demangle names, and how this tool can further improve the decompilation of Swift binaries.
Name Mangling in Swift
To provide some context for the problem we’re trying to solve, let’s first understand what name mangling is, why it’s used, and how Swift implements it.
Name mangling (or name decoration) as defined by Wikipedia is:
“a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.”
The need for this technique arises when we’re dealing with languages that support features like function overloading and namespaces, or any feature that allows different entities to share the same symbol. During the build process, the linker must determine which entity is intended when symbols collide.
Consider the following Swift example (from here) where function overloading is used:
// function with Int type parameter
func displayValue(value: Int) {
    print("Integer value:", value)
}
// function with String type parameter
func displayValue(value: String) {
    print("String value:", value)
}
// function call with String type parameter
displayValue(value: "Swift")
// function call with Int type parameter
displayValue(value: 2)
When the previous code is built, the compiler must differentiate the two displayValue functions when converting the code into binary format.
This is achieved through name mangling, which creates unique identifiers for each function.
We can observe the mangled names by calling swiftc with the -emit-sil flag, which will output the Swift Intermediate Language (SIL) file:
$ swiftc -emit-sil Sources/main.swift
... (trimmed) ...
// displayValue(value:)
sil hidden @$s4main12displayValue5valueySS_tF : $@convention(thin) (@guaranteed String) -> () {
... (trimmed) ...
// displayValue(value:)
sil hidden @$s4main12displayValue5valueySi_tF : $@convention(thin) (Int) -> () {
... (trimmed) ...
From here we can note that the two functions became:
@$s4main12displayValue5valueySS_tF
@$s4main12displayValue5valueySi_tF
The only difference is the type code near the end of the name: SS in one and Si in the other.
We can infer that this corresponds to one function taking a String and the other taking an Int, but let’s understand the mangling a little better.
We’ll use the Swift documentation about name mangling, available here, to decode the names, and the built-in demangler tool, available via swift demangle, to check our understanding.
If we start by using swift demangle to convert the names back, we can see the following:
$ swift demangle
@$s4main12displayValue5valueySS_tF
@main.displayValue(value: Swift.String) -> ()
@$s4main12displayValue5valueySi_tF
@main.displayValue(value: Swift.Int) -> ()
Let’s do the same process manually to better understand what is happening.
The initial @ is the sigil that SIL places in front of function names. It is not part of the mangled name, so we can safely ignore it.
As the documentation explains, the operators are structured in postfix order.
We’ll start from the end of the name, which is the F that tells us we’re dealing with a function:
entity-spec ::= decl-name label-list function-signature generic-signature? 'F' // function
Starting from the function definition shown above, we can follow the documentation to find every other definition we need. To make it easier to read, the following lists all of those definitions (with some optional parts already removed), which we’ll use next:
mangled-name ::= '$s' global
global ::= entity
entity ::= context entity-spec
context ::= module
module ::= identifier
identifier ::= NATURAL IDENTIFIER-STRING
IDENTIFIER-STRING ::= IDENTIFIER-START-CHAR IDENTIFIER-CHAR*
IDENTIFIER-START-CHAR ::= [_a-zA-Z]
IDENTIFIER-CHAR ::= [_$a-zA-Z0-9]
entity-spec ::= decl-name label-list function-signature 'F'
function-signature ::= params-type params-type
params-type ::= type
params-type ::= empty-list
decl-name ::= identifier
label-list ::= ('_' | identifier)*
empty-list ::= 'y'
type ::= any-generic-type
type ::= type-list 't'
type-list ::= list-type '_' list-type*
list-type ::= type identifier?
any-generic-type ::= standard-substitutions
standard-substitutions ::= 'S' KNOWN-TYPE-KIND
KNOWN-TYPE-KIND ::= 'i' // Swift.Int
KNOWN-TYPE-KIND ::= 'S' // Swift.String
With all of the previous, we can split $s4main12displayValue5valueySS_tF into its parts:
- $s is the prefix of a global mangled name
- 4main is the context main
- 12displayValue is the decl-name (function name)
- 5value is the label-list (argument names)
- y is the params-type for the return value, which is an empty list (no return value)
- SS_t is the params-type for the arguments, which is a tuple with the arguments, in this case Swift.String
- F marks the entity as a function
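The manual split above can be sketched as a small Python function. This is only an illustration of the reduced grammar listed earlier, not a reimplementation of swift demangle: it assumes a plain module-level function whose types are Swift.Int or Swift.String, and the helper names (read_identifier, read_type, read_params_type, demangle) are made up for this example.

```python
# Minimal sketch: handles only the reduced grammar shown in the post
# ($s, module, decl-name, label-list, result, params, 'F').
KNOWN_TYPES = {"i": "Swift.Int", "S": "Swift.String"}


def read_identifier(s, pos):
    """Read NATURAL IDENTIFIER-STRING, e.g. '4main' -> ('main', new_pos)."""
    start = pos
    while pos < len(s) and s[pos].isdigit():
        pos += 1
    if pos == start:
        raise ValueError(f"expected a length at offset {start}")
    end = pos + int(s[start:pos])
    if end > len(s):
        raise ValueError("identifier runs past the end of the name")
    return s[pos:end], end


def read_type(s, pos):
    """Read a standard substitution: 'S' followed by a known type kind."""
    code = s[pos:pos + 2]
    if len(code) != 2 or code[0] != "S" or code[1] not in KNOWN_TYPES:
        raise ValueError(f"unsupported type at offset {pos}")
    return KNOWN_TYPES[code[1]], pos + 2


def read_params_type(s, pos):
    """Read 'y' (empty list), a single type, or a tuple: type '_' type* 't'."""
    if s[pos:pos + 1] == "y":
        return [], pos + 1
    first, pos = read_type(s, pos)
    if s[pos:pos + 1] != "_":
        return [first], pos
    pos += 1
    items = [first]
    while s[pos:pos + 1] != "t":
        item, pos = read_type(s, pos)
        items.append(item)
    return items, pos + 1


def demangle(name):
    if not name.startswith("$s"):
        raise ValueError("not a '$s' mangled name")
    module, pos = read_identifier(name, 2)
    func, pos = read_identifier(name, pos)
    labels = []
    while name[pos:pos + 1].isdigit() or name[pos:pos + 1] == "_":
        if name[pos] == "_":
            labels.append("_")
            pos += 1
        else:
            label, pos = read_identifier(name, pos)
            labels.append(label)
    result, pos = read_params_type(name, pos)   # return value comes first
    params, pos = read_params_type(name, pos)   # then the arguments
    if name[pos:] != "F":
        raise ValueError("only plain functions ('F') are supported")
    args = ", ".join(
        f"{labels[i]}: {t}" if i < len(labels) and labels[i] != "_" else t
        for i, t in enumerate(params)
    )
    ret = result[0] if len(result) == 1 else "(" + ", ".join(result) + ")"
    return f"{module}.{func}({args}) -> {ret}"


print(demangle("$s4main12displayValue5valueySS_tF"))
print(demangle("$s4main12displayValue5valueySi_tF"))
```

For the two names from the SIL output, this prints the same text that swift demangle returned earlier, minus the leading @. Anything outside the reduced grammar (generics, methods, substitutions, and so on) raises a ValueError instead of guessing.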
At this point we have a better understanding of how the mangling works, and we can see how useful demangling is for recovering relevant information from a compiled binary.
Binary Ninja Plugin
Based on the previous knowledge, we can build a plugin that takes advantage of the already existing swift-demangle binary to give more context to a Swift binary that we are analysing.
The plugin is very straightforward and makes it easier to understand some of the Swift primitives that Binary Ninja still doesn’t correctly identify (e.g. string concatenation).
The plugin is available here and its flow is the following:
- Ask the user for the swift-demangle file location
- Get all the symbols in the binary
- Send the symbols to swift-demangle
- Iterate all symbols:
  - If the symbol changed (swift-demangle echoes the symbol back if it can’t demangle it)
    - If the symbol was not changed in Binary Ninja, change it to the demangled version
    - Else, add a comment with the demangled version
  - Else, continue
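The flow above can be sketched in plain Python. This is not the plugin’s actual code: the function names and the (address, name, user_renamed) tuple layout are hypothetical, and the sketch assumes that swift-demangle, when started without arguments, reads one name per line from standard input and writes one result per line. The Binary Ninja side (collecting symbols, renaming, commenting) is deliberately left out.

```python
import subprocess


def demangle_batch(tool_path, names):
    """Send every name to swift-demangle in a single call.

    Assumption: the tool reads names from stdin, one per line, and
    prints one output line per input line.
    """
    if not names:
        return []
    proc = subprocess.run(
        [tool_path],
        input="\n".join(names) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.splitlines()


def plan_updates(symbols, demangled):
    """Decide what to do with each symbol.

    symbols:   list of (address, mangled_name, user_renamed) tuples
               (a hypothetical layout chosen for this example)
    demangled: list of tool outputs, in the same order as symbols
    """
    actions = []
    for (address, mangled, user_renamed), new_name in zip(symbols, demangled):
        if new_name == mangled:
            continue  # echoed back unchanged: nothing to demangle
        if user_renamed:
            # The analyst already renamed it, so only attach a comment.
            actions.append(("comment", address, new_name))
        else:
            actions.append(("rename", address, new_name))
    return actions


# Example with canned tool output, so it runs without swift-demangle.
symbols = [
    (0x1000, "$s4main12displayValue5valueySS_tF", False),
    (0x2000, "$s4main12displayValue5valueySi_tF", True),
    (0x3000, "_printf", False),
]
outputs = [
    "main.displayValue(value: Swift.String) -> ()",
    "main.displayValue(value: Swift.Int) -> ()",
    "_printf",
]
for action in plan_updates(symbols, outputs):
    print(action)
```

Keeping the decision step separate from the subprocess call means it can be exercised with canned output, and a single batched call avoids starting the tool once per symbol.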
The following picture shows a simple code snippet before running the script:
And after running the script:
Future Work
Although the script is very simple, it provides an interesting starting point for other improvements, especially when it comes to identifying types and providing type propagation when analysing a binary.
The swift-demangle tool has the option to output a tree of the demangling, which looks like the following:
$ swift demangle --tree-only
@$s4main12displayValue5valueySi_tF
@Demangling for $s4main12displayValue5valueySi_tF
kind=Global
  kind=Function
    kind=Module, text="main"
    kind=Identifier, text="displayValue"
    kind=LabelList
      kind=Identifier, text="value"
    kind=Type
      kind=FunctionType
        kind=ArgumentTuple
          kind=Type
            kind=Tuple
              kind=TupleElement
                kind=Type
                  kind=Structure
                    kind=Module, text="Swift"
                    kind=Identifier, text="Int"
        kind=ReturnType
          kind=Type
            kind=Tuple
This provides very helpful information that can be fed into Binary Ninja to help define the function prototype, and consequently improve the readability of Swift-compiled binaries.
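As a rough sketch of how that tree could be consumed, the following Python parses the text output into nested dictionaries and pulls out the parameter types. It assumes the tool indents each child node two spaces deeper than its parent; the function names and the dictionary layout are invented for this example, and turning the result into a Binary Ninja function prototype is not shown.

```python
import re

# One node per line: optional indentation, a kind, and an optional text.
NODE_RE = re.compile(r'^( *)kind=(\w+)(?:, text="(.*)")?$')


def parse_tree(lines, indent=2):
    """Build a nested dict tree from indented 'kind=...' lines."""
    root = {"kind": "root", "text": None, "children": []}
    stack = [(-1, root)]  # (depth, node) pairs for the current path
    for line in lines:
        match = NODE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # e.g. the 'Demangling for ...' header line
        depth = len(match.group(1)) // indent
        node = {"kind": match.group(2), "text": match.group(3), "children": []}
        while stack[-1][0] >= depth:
            stack.pop()  # climb back up to this node's parent
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root


def find(node, kind):
    """Yield every descendant of the given kind, depth first."""
    for child in node["children"]:
        if child["kind"] == kind:
            yield child
        yield from find(child, kind)


def parameter_types(root):
    """Return names like 'Swift.Int' for each struct-typed parameter."""
    names = []
    for arg_tuple in find(root, "ArgumentTuple"):
        for element in find(arg_tuple, "TupleElement"):
            for struct in find(element, "Structure"):
                names.append(".".join(c["text"] for c in struct["children"]))
    return names


SAMPLE = '''\
@Demangling for $s4main12displayValue5valueySi_tF
kind=Global
  kind=Function
    kind=Module, text="main"
    kind=Identifier, text="displayValue"
    kind=LabelList
      kind=Identifier, text="value"
    kind=Type
      kind=FunctionType
        kind=ArgumentTuple
          kind=Type
            kind=Tuple
              kind=TupleElement
                kind=Type
                  kind=Structure
                    kind=Module, text="Swift"
                    kind=Identifier, text="Int"
        kind=ReturnType
          kind=Type
            kind=Tuple
'''

tree = parse_tree(SAMPLE.splitlines())
print(parameter_types(tree))
```

Only Structure nodes are handled here; classes, enums, generics and nested tuples would need their own cases before the result could drive type propagation.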