Provide a detailed summary of the following web content, including what type of content it is (e.g. news article, essay, technical report, blog post, product documentation, content marketing, etc). If the content looks like an error message, respond 'content unavailable'. If there is anything controversial please highlight the controversy. If there is something surprising, unique, or clever, please highlight that as well:

Title: Fun with macOS's SIP
Site: metalbear.co

While developing mirrord, which relies heavily on injecting itself into other people’s binaries, we ran into some challenges posed by macOS’s SIP (System Integrity Protection). This post details how we ultimately overcame these challenges, and we hope it can help other people who want to learn about SIP, as we’ve learned the hard way that very little has been written about this subject on the internet.

What is mirrord? #

mirrord lets you run a local process in the context of a cloud service, which means we can test our code on our staging cluster without actually deploying it there. This leads to shorter feedback loops (you don’t have to wait on long CI processes to test your code in staging conditions) and a more stable staging environment (since untested services aren’t being deployed there). There is a detailed overview of mirrord in this blog post.

What is SIP and why does mirrord care about it? #

Apple introduced SIP in 2015 to prevent tampering with system binaries, because those usually have high permissions and entitlements. mirrord works by injecting its library (.dylib or .so) into the local process, the one you want to “mirror” into the cloud. In order to inject itself into binaries, it uses LD_PRELOAD on Linux and DYLD_INSERT_LIBRARIES on macOS. One of the features of SIP is that it disallows the use of DYLD_INSERT_LIBRARIES with protected binaries. Bummer - we can’t use mirrord to locally run e.g. bash or ls in a cloud context: mirrord exec bash and mirrord exec ls won’t work.
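As a rough illustration of the injection mechanism described above, here is a minimal Rust sketch that prepares a child process with the loader variable set. The helper names `injection_env_var` and `command_with_injection` are hypothetical and are not taken from mirrord; the sketch also assumes that any non-macOS target is Linux.

```rust
use std::process::Command;

/// Name of the dynamic-loader variable used to preload a library.
/// Assumption: every non-macOS target is treated as Linux.
fn injection_env_var() -> &'static str {
    if cfg!(target_os = "macos") {
        "DYLD_INSERT_LIBRARIES"
    } else {
        "LD_PRELOAD"
    }
}

/// Hypothetical helper: build (but do not spawn) a command that asks the
/// dynamic loader to load `library_path` into `program`.
fn command_with_injection(program: &str, library_path: &str) -> Command {
    let mut command = Command::new(program);
    command.env(injection_env_var(), library_path);
    command
}
```

Calling `command_with_injection("ls", "/path/to/lib").status()` would then spawn the process. On macOS, if the program is SIP-protected, the variable is dropped before the library can load, which is exactly the problem the rest of the post deals with.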
We’ll have to find a way to get around SIP!

Detecting if a binary is SIP-protected #

In order to start bypassing SIP, we needed to find a way to check whether a binary is SIP-protected to begin with. We initially used a pretty coarse heuristic, assuming that a binary is SIP-protected if it’s in one of these locations:

However, we found that the “stat” function can return a flag called RESTRICTED, which we had read could be related to SIP, and decided to use that instead. The code is quite simple:

    let metadata = std::fs::metadata(&complete_path)?;
    if (metadata.st_flags() & SF_RESTRICTED) > 0 {
        return Ok(SipStatus::SomeSIP(complete_path, None));
    }

Shebang! #

Detection was simple enough when mirrord was run directly on a SIP-protected binary. However, we soon ran into a less trivial (but common) scenario: an interpreter script that starts with a shebang pointing to a SIP-protected binary, for example yarn / npm / pyenv. If we look at npm, for instance, it points to a file which starts with the following code:

    #!/usr/bin/env node
    require('../lib/cli.js')(process)

In this example it will execute env, which will in turn execute node on the rest of the script. The problem? /usr/bin/env is SIP-protected, meaning it will strip our DYLD_INSERT_LIBRARIES and then run node without mirrord. Thanks for nothing, SIP! So we also needed to check whether the file is a “shebang” file (i.e. starts with #!), what file the shebang points to, and whether that file is a SIP-protected binary.

Bypassing SIP #

Now that we had found a way to detect whether we’re being run on a SIP-protected binary, we needed to figure out how to bypass SIP and let mirrord load into the binary with DYLD_INSERT_LIBRARIES. When googling around, we found people saying you can bypass SIP by copying the binary to another directory and re-signing it. We found that to be partially true. Why partially? Because if you try to do it on Apple Silicon (arm), it doesn’t work. This is because, beginning with M1, macOS ships with arm64e binaries.
The e indicates an arm64 extension that adds pointer authentication. It’s another security measure added by Apple (kudos to Apple for having great security, too bad it affects us). We won’t go into details about what pointer authentication does, but you can read more about it here.

So why was this a problem? First, mirrord is written in Rust, which doesn’t support compiling arm64e binaries. The other problem is that only Apple-signed binaries can run with the arm64e architecture. This is what happens if we try the “old” trick:

    ➜ /tmp cp /usr/bin/env /tmp/env
    ➜ /tmp codesign -f -s - /tmp/env
    /tmp/env: replacing existing signature
    ➜ /tmp /tmp/env
    [1] 20114 killed /tmp/env

Killed! And it was so young. :(

Recording using Console (macOS’s built-in log viewer) while running the binary reveals the reason:

From Apple’s point of view, arm64e is preview only, i.e. the ABI can change and they don’t want third parties building on top of it, as it’s not guaranteed to keep working. You can enable running third-party executables with the arm64e ABI only if you boot into recovery mode and change the settings, which is not something we want to ask our users to do.

Handling arm64e #

Initially we tried to convert the arm64e ABI into arm64 on the fly. Yes, people familiar with how this ABI works probably think we’re insane, but we were optimistic... and it actually worked! For example, if you take /usr/bin/env and just change the file headers to say it’s arm64, you’d be able to re-sign it and run it normally!
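The header edit described above can be sketched in Rust. This is only an illustration, not mirrord's code: the function `set_arm64_subtype` and the constant names are hypothetical. It assumes a thin (non-fat), little-endian, 64-bit arm64 Mach-O image and only overwrites the byte at offset 8, the same byte that the dd command in mirrord's release workflow overwrites. It does not check whether the result will actually run.

```rust
/// 64-bit Mach-O magic (0xfeedfacf) as it appears on disk, little-endian.
const MH_MAGIC_64_LE: [u8; 4] = [0xcf, 0xfa, 0xed, 0xfe];
/// CPU_TYPE_ARM64 (0x0100000c) as it appears on disk, little-endian.
const CPU_TYPE_ARM64_LE: [u8; 4] = [0x0c, 0x00, 0x00, 0x01];
/// Low byte of the cpusubtype field for plain arm64 and for arm64e.
const SUBTYPE_ARM64_ALL: u8 = 0x00;
const SUBTYPE_ARM64E: u8 = 0x02;

/// Hypothetical helper: relabel an arm64 Mach-O image by overwriting the
/// low byte of its cpusubtype field (offset 8). All other bytes are kept.
fn set_arm64_subtype(image: &mut [u8], subtype: u8) -> Result<(), String> {
    if image.len() < 12 {
        return Err("too short to contain a Mach-O header".to_string());
    }
    if image[0..4] != MH_MAGIC_64_LE[..] {
        return Err("not a little-endian 64-bit Mach-O image".to_string());
    }
    if image[4..8] != CPU_TYPE_ARM64_LE[..] {
        return Err("cputype is not arm64".to_string());
    }
    image[8] = subtype;
    Ok(())
}
```

A caller would read the file into a `Vec<u8>`, call `set_arm64_subtype(&mut bytes, SUBTYPE_ARM64_ALL)` (or `SUBTYPE_ARM64E` for the opposite direction), write the bytes to a new path, and re-sign the result.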
We actually apply this kind of header edit to our own binaries so they can be loaded into arm64e binaries:

    # from our release.yaml https://github.com/metalbear-co/mirrord/blob/main/.github/workflows/release.yaml
    - name: build mirrord-layer macOS arm/arm64e
      # Editing the arm64 binary, since arm64e can be loaded into both arm64 & arm64e
      # >> target/debug/libmirrord_layer.dylib: Mach-O 64-bit dynamically linked shared library arm64
      # >> magic bits: 0000000 facf feed 000c 0100 0000 0000 0006 0000
      # >> After editing using dd -
      # >> magic bits: 0000000 facf feed 000c 0100 0002 0000 0006 0000
      # >> target/debug/libmirrord_layer.dylib: Mach-O 64-bit dynamically linked shared library arm64e
      run: |
        cargo +nightly build --release -p mirrord-layer --target=aarch64-apple-darwin
        cp target/aarch64-apple-darwin/release/libmirrord_layer.dylib target/aarch64-apple-darwin/release/libmirrord_layer_arm64e.dylib
        printf '\x02' | dd of=target/aarch64-apple-darwin/release/libmirrord_layer_arm64e.dylib bs=1 seek=8 count=1 conv=notrunc

It didn’t work for all binaries though (ls, for example), and when we started digging we found out that arm64e uses a lot of new features, for example specific relocations that contain pointer authentication data. We decided to give up on ABI conversion for the time being.

Luckily, Apple ships fat binaries on machines of both architectures. Fat binaries are Apple’s name for Mach-O files containing two different binaries, each built for a different architecture, so the runtime can decide which one it will use. By default, it will choose arm64e, but we can do something nice with the x64 binary.
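To make the fat binary layout more concrete, here is a minimal Rust sketch that looks up the x86_64 slice in a fat file's header. The function name `find_x86_64_slice` is hypothetical, and this is not how mirrord parses binaries. The sketch assumes the classic 32-bit fat header (magic 0xcafebabe followed by 20-byte big-endian architecture entries) and does not handle the 64-bit variant.

```rust
/// Magic number of a 32-bit fat (universal) header, stored big-endian.
const FAT_MAGIC: u32 = 0xcafe_babe;
/// CPU_TYPE_X86_64 as stored in a fat architecture entry.
const CPU_TYPE_X86_64: u32 = 0x0100_0007;
/// Size of one architecture entry: cputype, cpusubtype, offset, size, align.
const FAT_ARCH_SIZE: usize = 20;

/// Read a big-endian u32 at `at`, or None if it is out of bounds.
fn read_be_u32(data: &[u8], at: usize) -> Option<u32> {
    let b = data.get(at..at.checked_add(4)?)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Hypothetical helper: return (offset, size) of the x86_64 slice inside a
/// fat binary, or None if the data is not a fat binary or has no such slice.
fn find_x86_64_slice(data: &[u8]) -> Option<(usize, usize)> {
    if read_be_u32(data, 0)? != FAT_MAGIC {
        return None;
    }
    let count = read_be_u32(data, 4)? as usize;
    for i in 0..count {
        let entry = 8 + i * FAT_ARCH_SIZE;
        if read_be_u32(data, entry)? == CPU_TYPE_X86_64 {
            let offset = read_be_u32(data, entry + 8)? as usize;
            let size = read_be_u32(data, entry + 12)? as usize;
            // Reject entries that point outside the file.
            data.get(offset..offset.checked_add(size)?)?;
            return Some((offset, size));
        }
    }
    None
}
```

The returned pair can be used to cut the slice out of the file contents with `&data[offset..offset + size]`, which is the part that would then be written to a new file and re-signed.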
    /bin/ls: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
    /bin/ls (for architecture x86_64): Mach-O 64-bit executable x86_64
    /bin/ls (for architecture arm64e): Mach-O 64-bit executable arm64e

The idea is that we take the binary we want to load ourselves into, extract only the x64 binary (on arm), re-sign it, and run it. The only downsides are that we require Rosetta to be installed on the system and that there’s a performance impact, but system binaries are usually used for simple operations like env or bash (see the shebang case).

Putting it all together #

1. Detect SIP (Shebang/Restricted)
2. Patch
   a. Extract x64 binary into a new file
   b. chmod +x it
   c. Sign it
3. Execute

    /// Read the contents (or just the x86_64 section in case of a fat file) from the SIP binary at
    /// `path`, write it into `output`, give it the same permissions, and sign the new binary.
    fn patch_binary<P: AsRef<Path>, K: AsRef<Path>>(path: P, output: K) -> Result<()> {
        let data = std::fs::read(path.as_ref())?;
        let binary_info = BinaryInfo::from_object_bytes(&data)?;
        let x64_binary = &data[binary_info.offset..binary_info.offset + binary_info.size];
        std::fs::write(output.as_ref(), x64_binary)?;
        // Give the new file the same permissions as the old file.
        std::fs::set_permissions(
            output.as_ref(),
            std::fs::metadata(path.as_ref())?.permissions(),
        )?;
        codesign::sign(output)
    }

Integration into mirrord took two steps:

1. When using mirrord directly on a SIP-protected binary, do the patch.
2. When using mirrord on a process, and that process executes a SIP-protected binary, replace it on the fly. This was done by having mirrord hook execve in the process.

The execve hook:

    /// Hook for `libc::execve`.
    ///
    /// Patch file if it is SIPed, use new path if patched.
    /// If any args in argv are paths to mirrord's temp directory, strip the temp dir part.
    /// So if argv[1] is "/var/folders/1337/mirrord-bin/opt/homebrew/bin/npx"
    /// Switch it to "/opt/homebrew/bin/npx"
    /// then call normal execve with the possibly updated path and argv and the original envp.
    #[hook_guard_fn]
    pub(crate) unsafe extern "C" fn execve_detour(
        path: *const c_char,
        argv: *const *const c_char,
        envp: *const *const c_char,
    ) -> c_int {
        // Do unsafe part of path conversion here.
        let rawish_path = (!path.is_null()).then(|| CStr::from_ptr(path));
        let mut patched_path = CString::default();
        let final_path = patch_if_sip(rawish_path)
            .and_then(|s| match CString::new(s) {
                Ok(c_string) => {
                    pat