USB Reverse Engineering: Down the rabbit hole

Thanks for the featured writeup Hackaday! Make sure to check out the comments over there as well.

Looks like Hackernoon picked it up as well, make sure to check in with the comments there too.

It would be great if you could also head over to Hacker News, give an upvote, and join in the comments there. Let's get this information out there!

I tend to dive down rabbit holes a lot, and given the cost of context switching and memory deteriorating over time, sometimes the state I build up in my mind gets lost between the chances I get to dive in. These 'linkdump' posts are an attempt to collate at least some of that state in a way that I can hopefully restore to my brain at a later point.

This time around I was inspired to look into USB reverse engineering, protocol analysis, hardware hacking, and what would be involved in implementing custom drivers for arbitrary hardware. Or put another way: how do I hack all of the USBs?!??

It seems the deeper I went, the more interesting I found the content, and this post grew and grew. Hopefully it will help to shortcut your own journey down this path, and enlighten you to a whole new area of interesting things to hack!

Overview

tl;dr

This is long, has many sections, and time is precious:

Intro to USB

USB (Universal Serial Bus) is an industry standard covering cables, connectors and protocols, and is pretty ubiquitous among tech products these days. I won't go deep on describing all of the facts, since that's what Wikipedia is good at:

That said, it will be useful to understand some of the aspects of how USB devices and protocols are laid out, and some of the terminology used.

A USB system has:

  • A host, with one or more downstream ports, and multiple peripherals
  • Hubs may be included, allowing up to 5 tiers
  • A host may have multiple controllers, each with one or more ports
  • Up to 127 devices can be connected to a single host controller
  • A device may have several logical sub-devices, referred to as 'device functions'
  • A composite device may provide multiple functions (eg. webcam + microphone)
  • A compound device connects logical devices to a built in hub

Digging into the protocol/communication side of things:

  • Communication is based on pipes (logical channels), between the host and an endpoint (logical entity) on a device
  • A device can have up to 32 endpoints (16 IN, 16 OUT)
  • Endpoints are defined and numbered during initialization, so tend to remain fairly permanent, whereas a pipe may be opened/closed
  • Two types of pipe: stream and message
  • Message pipes are bi-directional and used for control transfers: short, simple commands + status response
  • Stream pipes are uni-directional and transfer data using isochronous, interrupt or bulk transfers
  • A set of endpoints with associated metadata is also known as an interface, each is associated with a single device function
  • All USB devices have at least one endpoint (endpoint 0, the default), used for control transfers. Descriptors sent over the default pipe can describe the other endpoints.
  • Descriptors form a hierarchy that you can view with tools like lsusb.
  • Device descriptor: contains information like device Vendor ID (VID) and Product ID (PID)
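To make the descriptor idea a bit more concrete, here is a minimal sketch of pulling the VID/PID out of a raw device descriptor. The 18-byte little-endian layout follows the standard device descriptor from chapter 9 of the USB 2.0 specification; the function names and the `SAMPLE` bytes are my own assumptions for illustration (the sample is hand-built to match the 0a81:0702 IDs, not captured from real hardware).

```python
import struct
from collections import namedtuple

# Fields of the 18-byte standard device descriptor, in wire order.
DeviceDescriptor = namedtuple(
    "DeviceDescriptor",
    "bLength bDescriptorType bcdUSB bDeviceClass bDeviceSubClass "
    "bDeviceProtocol bMaxPacketSize0 idVendor idProduct bcdDevice "
    "iManufacturer iProduct iSerialNumber bNumConfigurations",
)
_DEVICE_FORMAT = "<BBHBBBBHHHBBBB"  # little-endian, 18 bytes in total


def parse_device_descriptor(raw: bytes) -> DeviceDescriptor:
    """Unpack the first 18 bytes of `raw` as a device descriptor."""
    if len(raw) < struct.calcsize(_DEVICE_FORMAT):
        raise ValueError("device descriptor needs at least 18 bytes")
    fields = DeviceDescriptor(*struct.unpack_from(_DEVICE_FORMAT, raw))
    if fields.bDescriptorType != 0x01:  # 0x01 marks a DEVICE descriptor
        raise ValueError("not a device descriptor")
    return fields


def decode_endpoint_address(address: int) -> tuple:
    """Split bEndpointAddress into (endpoint number, direction)."""
    return address & 0x0F, "IN" if address & 0x80 else "OUT"


# Hypothetical, hand-built descriptor bytes (assumption, not a real capture).
SAMPLE = bytes.fromhex("1201000200000008810a0207000101020001")

if __name__ == "__main__":
    descriptor = parse_device_descriptor(SAMPLE)
    print(f"{descriptor.idVendor:04x}:{descriptor.idProduct:04x}")
    print(decode_endpoint_address(0x81))
```

The low four bits of an endpoint address give the endpoint number and the top bit gives the direction, which is where the "16 IN, 16 OUT" limit comes from.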

There are different transport types that can be used:

  • Interrupt transfers are for short periodic real-time data exchanges.
  • Isochronous transfers are somewhat similar but less strict; they allow for larger data blocks and are used by web cameras and similar devices, where delays or even losses of a single frame are not crucial.
  • Bulk transfers are for large amounts of data.
  • Control transfer type is the only one that has a standardised request (and response) format, and is used to manage devices
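As a rough illustration of that standardised control request format, the sketch below packs the 8-byte SETUP packet (bmRequestType, bRequest, wValue, wIndex, wLength) for a standard GET_DESCRIPTOR request, and decodes the transfer type from an endpoint descriptor's bmAttributes field. The field layout comes from the USB 2.0 specification; the helper names are just ones I made up for this example.

```python
import struct

# The low two bits of an endpoint's bmAttributes select the transfer type.
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")

GET_DESCRIPTOR = 0x06          # standard request code
DESCRIPTOR_TYPE_DEVICE = 0x01  # descriptor type goes in the high byte of wValue


def transfer_type(bm_attributes: int) -> str:
    """Name the transfer type encoded in an endpoint's bmAttributes."""
    return TRANSFER_TYPES[bm_attributes & 0x03]


def build_setup_packet(bm_request_type: int, b_request: int,
                       w_value: int, w_index: int, w_length: int) -> bytes:
    """Pack the five SETUP fields into their 8-byte little-endian wire form."""
    return struct.pack("<BBHHH", bm_request_type, b_request,
                       w_value, w_index, w_length)


if __name__ == "__main__":
    # 0x80 = device-to-host, standard request, recipient is the device.
    packet = build_setup_packet(0x80, GET_DESCRIPTOR,
                                DESCRIPTOR_TYPE_DEVICE << 8, 0, 18)
    print(packet.hex())
    print(transfer_type(0x03))
```

These eight bytes are what you will typically see at the start of every control transfer in a capture, which makes them a handy landmark when reading raw traffic.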

Further reading:

USB Reverse Engineering: An Introduction

Now, I could probably go through and write a whole blog post on this.. but, other people have done it for me! The following walks through an introduction to interfacing with, reverse engineering, understanding, and ultimately implementing software to drive a USB remote control car.

I found it quite easy to consume, and doesn't really assume much in the way of prior knowledge.

One of the tools used above was lsusb: "a utility for displaying information about USB buses in the system and the devices connected to them". Among other things, this allows the vendor and product ID of the device to be identified. Once identified, this tag can be used to query further information about the device, eg. lsusb -vd 0a81:0702.
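If you want to script around this, a small sketch like the following can pull the bus, device address and VID:PID out of a line of plain lsusb output. It assumes the usual `Bus NNN Device NNN: ID vvvv:pppp Name` line shape; the sample line and its device name are made up for illustration.

```python
import re

# Assumes the usual one-line-per-device format printed by plain `lsusb`.
_LSUSB_LINE = re.compile(
    r"Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\s*(.*)"
)


def parse_lsusb_line(line: str) -> dict:
    """Extract bus, device address, VID, PID and name from one lsusb line."""
    match = _LSUSB_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised lsusb line: {line!r}")
    bus, device, vid, pid, name = match.groups()
    return {
        "bus": int(bus),
        "device": int(device),
        "vid": int(vid, 16),
        "pid": int(pid, 16),
        "name": name,
    }


if __name__ == "__main__":
    # Hypothetical output line; the device name is a placeholder.
    sample = "Bus 001 Device 004: ID 0a81:0702 Example RC Car Controller"
    info = parse_lsusb_line(sample)
    print(f"lsusb -vd {info['vid']:04x}:{info['pid']:04x}")
```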

Other relevant tools/concepts used include:

  • usbmon: a facility in kernel which is used to collect traces of I/O on the USB bus
  • Wireshark USB Capture
  • PyUSB : USB access for Python
  • libusb : A cross-platform library to access USB devices
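To give a feel for what a usbmon trace looks like once you have one, here is a minimal sketch that splits the leading fields of a line in usbmon's text format: URB tag, timestamp, event type, and the address word (transfer type + direction, bus, device, endpoint). The field meanings are as I understand them from the kernel's usbmon documentation; the parser itself, and leaving everything after the address word undecoded, are simplifications of my own.

```python
from typing import NamedTuple

URB_TYPES = {"C": "control", "Z": "isochronous", "I": "interrupt", "B": "bulk"}
EVENT_TYPES = {"S": "submission", "C": "callback", "E": "submission error"}


class UsbmonEvent(NamedTuple):
    urb_tag: str
    timestamp_us: int
    event: str
    transfer: str
    direction: str
    bus: int
    device: int
    endpoint: int
    rest: list  # remaining words (setup packet, status, data), left undecoded


def parse_usbmon_line(line: str) -> UsbmonEvent:
    """Split the leading fields of one line of usbmon text output."""
    tag, timestamp, event, address, *rest = line.split()
    # The address word looks like "Ci:1:001:0" (type+direction:bus:device:endpoint).
    type_and_direction, bus, device, endpoint = address.split(":")
    return UsbmonEvent(
        urb_tag=tag,
        timestamp_us=int(timestamp),
        event=EVENT_TYPES[event],
        transfer=URB_TYPES[type_and_direction[0]],
        direction="IN" if type_and_direction[1] == "i" else "OUT",
        bus=int(bus),
        device=int(device),
        endpoint=int(endpoint),
        rest=rest,
    )


if __name__ == "__main__":
    sample = "d5ea89a0 3575914555 S Ci:1:001:0 s a3 00 0000 0003 0004 4 <"
    print(parse_usbmon_line(sample))
```

In practice Wireshark will do this decoding for you, but a few lines of parsing can be handy when you just want to grep a text trace for one endpoint.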

USB Reverse Engineering: Further Reading

The following are some additional relatively short reads on how others have approached reverse engineering some devices, including tools they used, and basic methodologies.

I would definitely suggest checking this one out first:

By this stage you're probably not going to pick up masses of new information, but here are the rest for completeness, just in case:

Some common tools/methods used in the above articles include:

The basic process seems to be:

  • Setup to capture the device
  • Identify the Vendor ID and Product ID
  • Determine the device descriptors / endpoints
  • Capture USB traffic / attempt to decode commands
  • Make a driver / program to interact
  • Potentially fuzz for other commands (generally safer to do read only)
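For the 'attempt to decode commands' step, one simple trick is to capture the same kind of packet several times while changing one thing on the device, then look at which bytes move. The sketch below shows that idea under my own assumptions: the function name is made up, and the captures are invented placeholder bytes rather than traffic from a real device.

```python
def varying_offsets(payloads: list) -> list:
    """Return the byte offsets whose value differs across the payloads."""
    if not payloads:
        return []
    length = len(payloads[0])
    if any(len(payload) != length for payload in payloads):
        raise ValueError("all payloads must have the same length")
    return [
        offset
        for offset in range(length)
        if len({payload[offset] for payload in payloads}) > 1
    ]


if __name__ == "__main__":
    # Hypothetical captures, e.g. the same command sent with three settings.
    captures = [
        bytes.fromhex("a1000100"),
        bytes.fromhex("a1000200"),
        bytes.fromhex("a1000400"),
    ]
    print(varying_offsets(captures))  # only offset 2 changes between captures
```

Bytes that never change are likely headers or command IDs, while the ones that vary are good candidates for the parameter you were changing.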

Another method of reverse engineering could be to reverse the device driver itself, and understand the functionality/features from that. This takes a more 'traditional' software reverse engineering approach to solving the problem.

If you want to be completely thorough, a hybrid approach may make the most sense (eg. analyse the traffic to and from the device itself, then use the existing driver to help understand the data being sent back/forth and/or confirm you have captured all of the features).

Software: Wireshark, usbmon, USBPcap, VirtualBox, etc

So as we learned in the above articles, there are a number of 'software only' methods we can use to capture/inspect USB traffic, with the main modern methods being:

It is also possible to 'pass through' USB devices with your favourite virtual machine software (VMware, Parallels, Virtualbox, KVM, QEMU, etc) to assist in capturing data, though I will leave that as an exercise to the reader to look up the specifics (some references are in the above walkthroughs).

There are also some older programs and methods that might still work but probably aren't ideal anymore, including:

Hardware: tl;dr

Too many choices? Don't want to read through them all? A good bet is probably:

Hardware: BeagleBoard-XM / USBSniffer (~2010-2013, ~$149+)

Based on a 2010 GSoC BeagleBoard USB Sniffer, this is an updated version of a BeagleBoard-XM based USB sniffer. It acts as a man-in-the-middle hardware proxy allowing USB traffic to be captured, and later viewed in Wireshark or similar.

Hardware: OpenVizsla (~2010-2014)

(You probably just want to look at daisho below)

OpenVizsla is a Open Hardware FPGA-based USB analyzer. Unlike other similar devices on the market, hardware design files are available as well as full source code for the firmware and client software of the device.

This was a Kickstarter Project to create an "Open Hardware FPGA-based USB analyzer" targeting USB 2.0 High-Speed. Opinions about this project on the internet/forums seem to be pretty mixed, with some going as far as calling it a scam. It sounds like there were a lot of delays and other issues.

According to this blog post, it sounds like they eventually got something working (years later) under the moniker 'OV3'. There seem to be a number of related posts on this blog under the tag 'OpenVizsla':

You should be able to find the latest news and code on the following website/GitHub pages:

Hardware: SerialUSB / GIMX USB Adapter (~2015, ~US$5-35)

A cheap USB proxy for input devices.

SerialUSB is at the low end of hardware capture devices, designed to be a low cost solution to assist in adding support for USB gaming peripheral protocols to the GIMX project.

For most purposes we probably won't need hardware for things at this level.. the software-based capture devices are likely good enough. But who knows.. maybe there are other uses for super cheap hardware capture..

Hardware: GoodFET (~2009-2018+, ~US$50)

(Before I dive in too deeply.. if you want the latest/greatest in this space, check out the GreatFET.)

The GoodFET is an open-source JTAG adapter, loosely based upon the TI MSP430 FET UIF and EZ430U boards, as described in their documentation. In addition to JTAG, the GoodFET has been inspired by HackADay's Bus Pirate to become a universal serial bus interface.

It "is a nifty little tool for quickly exposing embedded system buses to userland Python code". Based on the bits and pieces I can pull together, I believe this will allow us to do our typical hardware based sniffing/dumping/etc, but I would have to find a better walkthrough/try it myself before being able to say that for certain.

Now one thing about this project that tends to confuse me is the versions/revision naming.. for example here are a number of the older revisions and their names:

As best I can tell.. there seem to be multiple parallel hardware versions at certain times.. based on different chipsets. And those versions may fork/merge at later times. Attempting to follow that logic.. the two most current (non-retired) revisions seem to be:

You should probably just spend time browsing around this site in general.. there are so many interesting sounding open-hardware designs.

You can order the boards (or request a free one!) from:

Further reading:

Hardware: Facedancer, Beagledancer, Raspdancer (~2012-2018+, ~US$85-???)

(Make sure to look at the facedancer 2.0 below as well)

The Facedancer21 is the twenty-fourth hardware revision of the GoodFET, owing its heritage to the GoodFET41 and Facedancer20. Unlike the general-purpose GoodFET boards, the only purpose of this board is to allow USB devices to be written in host-side Python, so that one workstation can fuzz-test the USB device drivers of another host.

The facedancer is less about capturing data, and more about emulating a USB device with software (Python to be exact!). One reason for wanting to do this might be to fuzz the device drivers on a host system, though I'm sure there could be a number of other creative uses too.. Maybe you want to allow one hardware device to masquerade as another and talk to its drivers..

The following articles are a good read:

The Facedancer hardware extends the GoodFET framework to allow for fast prototyping and fuzzing of USB device drivers. Software connect/disconnect allows the enumeration process to be repeated, and Ryan's fork allows for clean coding of the various data structures with Scapy.

You can find out more about the facedancer boards at:

You can order the board (or request a free one!) from:

Other hardware projects that connect with the facedancer:

Hardware: Beaglebone Black + USBProxy (~2013?)

(This has been superseded by the facedancer 2.0 below)

A proxy for USB devices, libUSB and gadgetFS. A USB man in the middle device using embedded Linux devices with on the go controllers.

Presentations/etc:

Hardware: Daisho (~2013-?2018+?)

SuperSpeed USB 3.0 FPGA platform

This is a project designed for monitoring a number of high speed communication technologies at the physical layer, including USB 3.0, Gigabit ethernet, HDMI, etc. You can read more about it in the introduction blog:

You can find more about the project at the following sites:

Presentations/etc:

Hardware: GreatFET (~2015-2018+)

GreatFET is a next generation GoodFET intended to serve as your custom Hi-Speed USB peripheral through the addition of expansion boards called "neighbors".

Better GoodFET hardware, cheaper. Sounds great to me. According to the main site this is still at a 'functional prototype' stage though:

Functional prototype hardware has been produced. Firmware is in progress.

That said.. looking around twitter and other places.. it sounds like it's pretty functional. Here are your main resources:

I couldn't find many resources about how to buy these.. but here is what I got:

Presentations/etc:

Further reading:

Hardware: Facedancer 2.0 (~2017-2018+)

This repository houses the next generation of FaceDancer software. Descended from the original GoodFET-based FaceDancer, this repository provides a python module that provides expanded FaceDancer support-- including support for multiple boards and some pretty significant new features.

This is the v2.x of the facedancer, designed to be better/greater. I won't go too deeply into things, but the following are useful resources:

Presentations/Training/etc:

Commercial Hardware: TotalPhase BeagleUSB

TotalPhase are a company that provide a number of commercial hardware protocol analysers, including USB. I found that a number of the walkthroughs I would come across would at least mention these products in passing.

As I understand it, they are only good for passively reading/inspecting/logging the traffic, so no good if you want to do injection or other nefarious things.

They have a number of different products ranging from the relatively cheap (for low speed), up to the rather expensive (for USB 3.0):

Further Reading/Presentations

I figured I'd add this section for some other interesting presentations/resources that just didn't seem to fit nicely into the categories above. Some of them go a little beyond just USB hardware hacking, and into more general/specific hardware hacking tools:

People to Watch

While I was doing this research there were a few names that just kept popping up time and time again, and seem to be working on really cool things in this space. To make it easier to follow them on their relevant platforms, I wanted to collect them together here for you (in no particular order):

If I've missed anyone that you feel deserves to be here too, please let me know!

Code/Drivers/etc

So we know how to capture traffic from our devices, proxy it with hardware, break the protocols down and understand them. But we also want to be able to talk back to them, control them, and truly interact. This is where code and drivers come in. Now we've sort of skimmed over these topics in a few of the above sections, but for the sake of clarity I wanted to group them all here as well.

When I first thought about writing this section I thought we were going to be getting deep into kernel drivers, and fighting with arcane systems, but it seems we actually have a much nicer alternative before all of that, thanks to libusb, pyusb, and friends:

You can see examples of using libusb/pyUSB in some of the walkthroughs mentioned earlier.
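As a taste of what that looks like, here is a minimal PyUSB-flavoured sketch that walks a device's configurations, interfaces and endpoints. It relies on PyUSB's `usb.core.find()` and on device/configuration/interface objects being iterable, as described in the PyUSB tutorial; the helper names and the 0a81:0702 IDs are just placeholders, and I've kept the PyUSB import inside `find_device()` so that the listing logic can be tried without any hardware attached.

```python
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")


def find_device(vendor_id: int, product_id: int):
    """Look up a device with PyUSB; returns None if it is not connected."""
    import usb.core  # requires `pip install pyusb` plus a libusb backend

    return usb.core.find(idVendor=vendor_id, idProduct=product_id)


def list_endpoints(device) -> list:
    """Walk configurations -> interfaces -> endpoints and summarise each one.

    Works on a PyUSB device, or on any object with the same nested shape.
    """
    rows = []
    for configuration in device:
        for interface in configuration:
            for endpoint in interface:
                address = endpoint.bEndpointAddress
                rows.append((
                    configuration.bConfigurationValue,
                    interface.bInterfaceNumber,
                    interface.bAlternateSetting,
                    address,
                    "IN" if address & 0x80 else "OUT",
                    TRANSFER_TYPES[endpoint.bmAttributes & 0x03],
                    endpoint.wMaxPacketSize,
                ))
    return rows


def main() -> None:
    # Placeholder IDs: swap in the VID/PID that lsusb reports for your device.
    device = find_device(0x0A81, 0x0702)
    if device is None:
        print("device not found")
        return
    for row in list_endpoints(device):
        print(row)
```

Calling `main()` with the device plugged in (and suitable permissions) should print one row per endpoint, which gives you the addresses you need before attempting any reads or writes.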

Now while these libraries give us a whole lot of power and make it pretty easy to write our software, there may be times where they just don't quite cover what we need. That's when we can go deeper into the weird and wonderful world of driver development. I won't cover this too in-depth at the moment as it could be a whole blog series on its own, but here are a few resources to get you started:

Where next? Device Emulation, USB over IP, etc

Now that you've figured out all of the intricacies of the device, understood its protocol and written some software (or even a driver) to interface with it.. what about the other side of things?

  • Can we emulate the device in software (for testing, or other purposes)?
  • Can we take the information from that device and stream it somewhere remotely?
  • Can we make a new hardware device that 'presents itself' as the device we just looked at? (eg. to interface with existing drivers/software)

This is where device emulation and USB over IP can come to the party. I haven't dug too deeply into this topic, but a well placed Google search or two (github usb over ip, github usb emulation) turned up some interesting looking resources (and I'm sure there are far more out there..):

Also, don't forget those hardware devices mentioned above that are designed for emulation..

Definitely an area that could be interesting to explore deeper, maybe in a future project/post.

IoT, Hardware Hacking, Fuzzing, etc

Once we understand the language these devices speak, how to listen to it, how to emulate it.. what's next? One idea is to apply the concept of fuzzing used in the software world (random/crafted data used to look for crashes in software), and turn it to hardware. And with the prevalence of IoT devices out there now (often with woeful security).. this could be another interesting rabbithole to explore (google: usb hardware fuzzing):
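To sketch what the 'random/crafted data' part might look like, here is a tiny mutation-based generator that takes a known-good payload and flips a few random bits to produce test cases. This is only the case-generation half of a fuzzer and is entirely my own toy example: the names are made up, the seed payload is a placeholder, and actually sending these to a device (and watching for misbehaviour) is left out.

```python
import random


def mutate(seed_payload: bytes, rng: random.Random, max_flips: int = 3) -> bytes:
    """Return a copy of the payload with 1..max_flips random bits flipped."""
    if not seed_payload:
        raise ValueError("seed payload must not be empty")
    data = bytearray(seed_payload)
    for _ in range(rng.randint(1, max_flips)):
        position = rng.randrange(len(data))
        data[position] ^= 1 << rng.randrange(8)  # flip one bit in that byte
    return bytes(data)


def generate_cases(seed_payload: bytes, count: int, seed: int = 0) -> list:
    """Build a reproducible list of mutated payloads from one known-good one."""
    rng = random.Random(seed)  # fixed seed so a crash can be replayed later
    return [mutate(seed_payload, rng) for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical known-good command bytes taken from a capture.
    for case in generate_cases(bytes.fromhex("a1000100"), 5):
        print(case.hex())
```

Seeding the generator is a deliberate choice here: if a particular case makes the device or driver fall over, you want to be able to regenerate exactly the same bytes.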

Link Dump

After all of that.. there is only one little link left in my linkdump, and from memory, I think it was the one that started this cascading flow of rabbitholes. Not really anything to see here that we haven't already covered, but for posterity:

Conclusion

Well.. that got longer than I expected! What originally started out as me wanting to dump a few links I was collecting as I read into this subject seems to have ended up as a rough reference guide to getting started on AllTheThings(tm) relating to USB reverse engineering and associated hardware hacking.

While this post by itself isn't going to give you all the answers, hopefully it's given you enough of a base that you can branch out and dig deeper into the aspects that interest you. And when you do, let me know what you build/break/discover!

Was there something I missed? A new shiny piece of hardware? An amazing program? Maybe you have some awesome techniques to share? Or just a story about what you've been able to do with this newfound knowledge? I'd love to hear from you in the comments below!