
Design a What Protocol?

March 22, 2020 20:36   ~4  minutes

Protocol design always seemed daunting and complex to me until I actually started thinking about it. With Marshal and PCLink I couldn't avoid it: if I wanted to finish either project, I had to find a way to make multiple devices communicate. Luckily, I came across a blog post by Maya Posch (a great read) that demystified things and made everything click.

Start Simple

I took a look at my requirements for PCLink and realized my needs were not very in-depth. The needs for Marshal are a bit more complex, and that's another story, but in essence I needed a way to send commands from a mobile device to a desktop application.

These commands could be coordinates for where to move the mouse, button actions, or text input. With this in mind, I knew a text-based protocol would be easier. I opted for TCP sockets over UDP because, with a connection, the desktop client only receives data from the mobile device that initially connected to it.
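To make the connection-oriented idea concrete, here is a minimal sketch of the desktop side using Python's standard socket module. It is not PCLink's actual code: the function names, the loopback address, and the buffer size are assumptions chosen for illustration. It accepts exactly one peer and reads text from that connection until the peer closes it.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket (port 0 lets the OS pick a free port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def read_from_one_client(srv, bufsize=1024):
    """Accept a single connection and return (peer address, decoded text).

    Everything read here comes from the one peer that connected,
    which is the property that made TCP attractive over UDP.
    """
    conn, addr = srv.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(bufsize)
            if not data:  # empty bytes means the peer closed the connection
                break
            chunks.append(data)
    return addr, b"".join(chunks).decode("utf-8")
```

With UDP, any host could send a datagram to the open port, so the application would have to check the sender of every packet itself.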

From there it was all about grammar rules.

How do you say hi and please

Analogies are great for explaining ideas, so if it helps, protocol design can be equated to grammar rules: the language is already there (whatever language you opted to use) and you're specifying how it's used to communicate with another system.

The initial handshake was straightforward: the desktop app starts and the mobile app scans the QR code to connect. After that, it begins sending the user's commands.

After that was in place, I tried my luck at setting up PGP using python-gnupg. Since I already had a QR code generated on server startup, I was able to embed the public key in it and pass it to the mobile app when the code is scanned.

The data requirements were pretty straightforward:

  • X,Y coordinates to update mouse positions

  • Single characters for keyboard input

  • The occasional special code for other characters and events like backspaces and button clicks, etc
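The three kinds of data above map naturally onto a small text grammar. The post doesn't show the real wire format, so the sketch below invents one purely for illustration: the verbs MOVE, KEY, and SPECIAL, the set of special codes, and the parse_command function are all hypothetical.

```python
# Hypothetical special codes; the real set used by PCLink is not given in the post.
SPECIAL_CODES = {"BACKSPACE", "ENTER", "LEFT_CLICK", "RIGHT_CLICK"}


def parse_command(message):
    """Parse one text command into a tuple the desktop app could act on.

    Assumed grammar (illustrative only):
        MOVE <x>,<y>      e.g. "MOVE 120,45"
        KEY <char>        e.g. "KEY a"
        SPECIAL <code>    e.g. "SPECIAL BACKSPACE"
    """
    verb, _, arg = message.partition(" ")
    if verb == "MOVE":
        x_text, y_text = arg.split(",")  # raises ValueError if not exactly two parts
        return ("move", int(x_text), int(y_text))
    if verb == "KEY":
        if len(arg) != 1:
            raise ValueError(f"KEY expects a single character, got {arg!r}")
        return ("key", arg)
    if verb == "SPECIAL":
        if arg not in SPECIAL_CODES:
            raise ValueError(f"unknown special code: {arg!r}")
        return ("special", arg)
    raise ValueError(f"unknown command: {message!r}")
```

Putting the verb first means the receiver can decide how to read the rest of the message after looking at a single word, which is the "grammar rule" at work.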

I tried different approaches, but ultimately I had to find a reliable way to send information and have it interpreted properly.

If someone said "Eat will hey Sam meeting you? before", I'd be a bit confused because I'm not familiar with a language that has that set of grammar rules. However, "Hey, will you eat before meeting Sam?" sounds like a message I can interpret, because I'm familiar with English grammar rules and can decipher the message based on them.

After I determined how I wanted to send and receive the information, I realized I had missed an important aspect of grammar: punctuation! I was sending commands through the TCP stream, but I didn't indicate where one message ended and the next began. Rather than implementing some form of padding so that every message has a fixed, expected length, I added a simple ; character to the end of each command.

It's used in many common programming languages to end a statement, so I thought it would be fitting, and it worked perfectly once I implemented a check for it. Now each message had its own start and end (the contents between consecutive semicolons).
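The "check" for the semicolon can be sketched as a small buffer that collects whatever arrives from the socket and only hands back complete messages. This is a rough illustration rather than the code PCLink uses; the SemicolonFramer class and its feed method are names made up for this example.

```python
class SemicolonFramer:
    """Split a TCP byte stream into messages terminated by ';'.

    TCP delivers a stream, not messages, so one recv() call may contain
    half a command or several commands at once. Incomplete data stays
    in the buffer until its terminating semicolon arrives.
    """

    def __init__(self, delimiter=b";"):
        self._delimiter = delimiter
        self._buffer = bytearray()

    def feed(self, chunk):
        """Add received bytes and return a list of complete messages."""
        self._buffer.extend(chunk)
        messages = []
        while True:
            index = self._buffer.find(self._delimiter)
            if index == -1:
                break  # no complete message left in the buffer
            frame = bytes(self._buffer[:index])
            del self._buffer[: index + len(self._delimiter)]
            messages.append(frame.decode("utf-8"))
        return messages


framer = SemicolonFramer()
first = framer.feed(b"MOVE 10,20;KE")          # second command is cut off
second = framer.feed(b"Y a;SPECIAL ENTER;")    # rest of it arrives later
```

One thing to keep in mind with any delimiter: if the user types a literal semicolon, it has to be escaped or sent as one of the special codes, otherwise it would be read as the end of a message.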

Afterthoughts

After having all the pieces laid out between the desktop and mobile apps, it was a matter of putting it all together and testing the full _conversation_: initiating a connection, transmitting data, and terminating the connection.

It was a pain to get Tkinter working with threads for some reason, but I was able to test the edge cases as well as the message format, and it was ultimately a fun weekend task.

The requirements for the Marshal project are a bit more complex, since it's intended to be a distributed system as opposed to a client-server model, but I'm much more confident in at least finding a starting point!

Brian Ngobidi

Hi! I'm Brian Ngobidi, a technology consultant and security researcher based in New York. Thanks for reading!