(Scroll to bottom for video link if the above is not working)

The Problem We Are Solving: Standard military-grade encryption has a fatal operational flaw: it is highly identifiable. If a journalist or activist sends an AES-256 encrypted message over a monitored network, the firewall doesn’t need to break the cipher to know something is being hidden. Its detection algorithm instantly flags the ciphertext (e.g., U2FsdGV...) as an anomaly, blocks the message, and targets the sender. Encryption protects the data, but it exposes the user. We needed a way to transmit secure data while preserving plausible deniability.

The Solution: Ghost Protocol is a behavioural steganography engine that encrypts sensitive messages and encodes them as natural-looking emojis embedded within standard text posts. The engine is trained on historical Twitter datasets.

Instead of posting suspicious blocks of ciphertext, users type their hidden message along with a "normal" block of text that will be made public, and our engine securely encodes the hidden message into emojis embedded in that text. Crucially, each user generates their own unique encryption key and gets a custom emoji dictionary built from historical data scraped from their social profile. This means the camouflage is uniquely tailored to the individual, making it an essential tool for whistleblowers, dissidents, or anyone operating in hostile digital environments where traditional encryption is banned or monitored. The process is simple yet effective: the user shares their encryption key with the intended recipients, who can then decode the message, while third parties see nothing suspicious.
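The encode-and-share flow described above can be sketched roughly as follows. This is a minimal illustration, not Ghost Protocol's actual implementation: the 16-emoji dictionary, the function names, and the SHAKE-256 keystream are all assumptions made for the example. A real deployment would use a vetted authenticated cipher with a fresh nonce per message, since reusing one keystream across messages is insecure.

```python
import hashlib

# Hypothetical 16-emoji dictionary: each emoji carries 4 bits of data.
# All entries are single code points so decoding can work per character.
EMOJIS = ["😀", "😂", "😍", "😎", "😭", "😡", "👍", "🙏",
          "🔥", "✨", "🎉", "💯", "🤔", "😴", "🥳", "🌍"]
INDEX = {emoji: value for value, emoji in enumerate(EMOJIS)}


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: n bytes of SHAKE-256 output derived from the key."""
    return hashlib.shake_256(key).digest(n)


def encode(secret: str, key: bytes) -> list[str]:
    """XOR the message with the keystream, then emit two emojis per byte."""
    data = secret.encode("utf-8")
    masked = bytes(b ^ k for b, k in zip(data, keystream(key, len(data))))
    out = []
    for byte in masked:
        out.append(EMOJIS[byte >> 4])    # high nibble
        out.append(EMOJIS[byte & 0x0F])  # low nibble
    return out


def decode(emojis: list[str], key: bytes) -> str:
    """Reverse of encode: emojis back to nibbles, bytes, then plaintext."""
    nibbles = [INDEX[e] for e in emojis]
    masked = bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
    data = bytes(b ^ k for b, k in zip(masked, keystream(key, len(masked))))
    return data.decode("utf-8")
```

A recipient holding the same key and dictionary calls `decode` on the emojis pulled out of the public post. Anyone without the key sees only a sequence of common emojis.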

Challenges We Ran Into: Preventing Reverse Engineering: Our first major hurdle was ensuring the steganography couldn't be easily decoded by a third party. We realised a static emoji dictionary was too vulnerable. To solve this, we customised the encryption process to the specific user. By scraping their social media profile, the engine builds a dynamic, bespoke emoji dictionary combining their historically most-used emojis with current trending ones.
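One way such a bespoke dictionary could be assembled is sketched below. Everything here is an assumption for illustration: the `is_emoji` range check is a rough heuristic rather than a full Unicode emoji test, the `build_dictionary` name and its parameters are hypothetical, and the key-seeded shuffle is our guess at how the emoji-to-value mapping might be kept private even when the user's emoji habits are public.

```python
import hashlib
import random
from collections import Counter


def is_emoji(ch: str) -> bool:
    """Rough heuristic: covers the main emoji blocks, not every emoji."""
    cp = ord(ch)
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def build_dictionary(posts: list[str], trending: list[str],
                     key: bytes, size: int = 16) -> list[str]:
    """Pick `size` emojis: the user's most-used first, then trending ones."""
    counts = Counter(ch for post in posts for ch in post if is_emoji(ch))
    ranked = [emoji for emoji, _ in counts.most_common()]
    for emoji in trending:
        if emoji not in ranked:
            ranked.append(emoji)
    if len(ranked) < size:
        raise ValueError("not enough distinct emojis to fill the dictionary")
    chosen = ranked[:size]
    # Assumption: derive the ordering from the key so that the mapping from
    # emoji to value differs per key. random.Random is deterministic but is
    # not a cryptographic generator; a real system would need a stronger one.
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    random.Random(seed).shuffle(chosen)
    return chosen
```

The position of each emoji in the returned list is the value it encodes, so sender and recipient must build the list from the same inputs and the same key.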

The "Randomness" Red Flag: Initially, the engine dropped emojis into the text at positions that looked random and algorithmic, which immediately aroused suspicion. We had to heavily refine the distribution and formatting logic so that the output messages kept a natural feel and didn't trigger spam filters or human suspicion.
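As a rough illustration of what a more natural placement rule might look like, the sketch below spreads emojis in small clusters at the ends of sentences instead of scattering them between arbitrary words. The `embed` and `extract` names, the `max_cluster` limit, and the sentence-splitting regex are assumptions for this example, and it assumes the cover text contains no dictionary emojis of its own.

```python
import re


def embed(cover: str, emojis: list[str], max_cluster: int = 3) -> str:
    """Append emojis in small, evenly sized clusters after each sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", cover.strip())
    base, extra = divmod(len(emojis), len(sentences))
    if base + (1 if extra else 0) > max_cluster:
        raise ValueError("cover text is too short to hide this many emojis")
    out, pos = [], 0
    for i, sentence in enumerate(sentences):
        take = base + (1 if i < extra else 0)  # earlier sentences take the remainder
        cluster = "".join(emojis[pos:pos + take])
        pos += take
        out.append(f"{sentence} {cluster}" if cluster else sentence)
    return " ".join(out)


def extract(text: str, alphabet: set[str]) -> list[str]:
    """Recover the emoji sequence, in order, ignoring all other characters."""
    return [ch for ch in text if ch in alphabet]
```

Rejecting a cover text that is too short is a deliberate choice in this sketch: forcing ten emojis onto one sentence is exactly the pattern that draws attention.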

How We Can Extend This: Multi-Platform Profiling: Right now, the bespoke dictionary relies on a single data source (Twitter or Bluesky profiles). We plan to extend our scrapers to ingest a wider range of social media profiles (e.g. Reddit, Discord, Instagram) to build a much stronger, cross-platform behavioural footprint.

Maximising Conciseness: Because encrypting and encoding a message adds data overhead, the emoji-to-word ratio can currently get high for longer payloads. Our next major architectural update will focus on compressing the payload before it reaches the stream cipher, allowing us to hide much larger amounts of data inside far fewer emojis.
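The overhead can be estimated with some simple arithmetic. The sketch below assumes fixed-width packing, where a dictionary of N emojis carries floor(log2 N) bits per emoji, and uses `zlib` as a stand-in compressor. The function names are hypothetical. Compression is applied before encryption because well-encrypted output looks random and does not compress.

```python
import zlib


def emojis_needed(n_bytes: int, dict_size: int) -> int:
    """Emojis required to carry n_bytes with fixed-width packing."""
    bits_per_emoji = dict_size.bit_length() - 1  # floor(log2(dict_size))
    if bits_per_emoji < 1:
        raise ValueError("dictionary needs at least 2 emojis")
    return -(-8 * n_bytes // bits_per_emoji)  # ceiling division


def payload_cost(message: bytes, dict_size: int, compress: bool) -> int:
    """Emoji count for a message, optionally compressed before encryption."""
    data = zlib.compress(message, 9) if compress else message
    return emojis_needed(len(data), dict_size)
```

With a 16-emoji dictionary a 10-byte payload needs 20 emojis, while a 256-emoji dictionary would need 10. Note that general-purpose compressors can make very short messages slightly larger, so compression mainly helps longer or repetitive payloads.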

Video demo link if you are having issues: https://youtu.be/U6yxIyp0_Gg
