🆘 Informal social interactions are hard to track and even harder to trust.
Like all forms of informal data, informal social interactions - e.g., Telegram and Discord conversations - are difficult to track and trust.
Informal social interactions occur organically, their meaning is highly contextual, and they are much harder to track than a hashtag. Conversations coalesce around topics in ways that often aren't immediately obvious even to human readers. On top of that, these interactions are generated on such a vast scale, every second of the day, that tracking them is impossible without the right technologies.
Even if you could track and comprehend such vast amounts of informal social interactions, they can never be trusted enough to enable automated mechanisms if the process is done by centralized players.
Informal social interactions are fundamental to reputations - not being able to trust or effectively track this informal data set creates a myriad of headaches.
😰 Lack of trusted informal social interaction data makes life very difficult.
Ever wonder what's the difference between marketing (read: hyping) a project vs. throwing money into a black hole? There is none today. It's nearly impossible to track, automatically and at scale, what impact your campaigns made. If you cannot quantify or even verify the social impact, you are wasting money. Have you ever:
- run a social bounty, where 99.9% of rewards go to bounty hunters with zero audiences?
- held a Telegram AMA, where 100% of the questions asked were copy-pasted from other AMAs in a group filled with 100k bots?
- paid a Twitter shill, where 99% of the replies to their tweet are others shilling their own projects?
If you have, you know the pain.
What if you wanted to know what new coins are being talked about in Telegram? Or what new hot NFTs are getting airdropped in Discord? What if you wanted to know before everyone, including your grandma, apes in? Unless you spend your every waking moment trawling every single channel/forum/group/tweet, discovering emerging trends is almost pure luck.
Being able to spot trends is why the biggest VCs are so well-paid, but maybe they're just guessing like you and me?
🥳 Taraxa built something to make your lives better.
Taraxa Echo makes informal social interactions verifiable, quantifiable, and trustworthy.
Taraxa Echo uses decentralized data gathering, anchoring, and analytics layers to make informal social interactions trustworthy, verifiable, and quantifiable. This creates an infinite set of trusted and composable social signals that could be used to build a myriad of applications.
Here, we'll outline Echo's decentralized architecture. Since this is under ongoing development, expect changes down the road.
1. Echo Node
Echo Nodes are the core of the decentralized data collection network. They're run by individual node runners, each collecting a subset of the overall social data coverage from multiple public, open social platforms, such as Telegram, Twitter, and Discord.
For example, in Telegram, a node is logged in with one account and listens to many Telegram chat groups and channels at a time through its Ingestor. The Ingestor organizes the data into a standardized data format, runs it through a set of standardized Analytics pipelines (more on this in our next article), and then stores everything in the node's local Storage.
Besides collecting, analyzing, and storing social data, the Echo Node also periodically communicates with two external entities: IPFS and the Echo Smart Contract.
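To make the Ingestor step a bit more concrete, here's a minimal Python sketch of what "organizing data into a standardized format" could look like. Everything in it is an assumption for illustration: the `SocialMessage` fields, the shape of the raw Telegram-style payload, and the function names are hypothetical and not taken from the actual Echo Node code.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SocialMessage:
    """Hypothetical platform-neutral record; the field names are assumptions."""
    platform: str
    group_id: str
    message_id: str
    author_id: str
    sent_at: str  # ISO 8601 timestamp in UTC
    text: str


def normalize_telegram(raw: dict) -> SocialMessage:
    """Map a raw Telegram-style payload (shape assumed) onto the common record."""
    sent = datetime.fromtimestamp(raw["date"], tz=timezone.utc)
    return SocialMessage(
        platform="telegram",
        group_id=str(raw["chat_id"]),
        message_id=str(raw["id"]),
        author_id=str(raw["from_id"]),
        sent_at=sent.isoformat(),
        text=raw.get("text", "").strip(),
    )


def serialize(msg: SocialMessage) -> str:
    """Canonical JSON (sorted keys, fixed separators): equal records give equal text."""
    return json.dumps(asdict(msg), sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
```

A canonical serialization matters here because, later on, several nodes listening to the same group need to arrive at byte-identical output for their hashes to match.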
2. IPFS
Echo Nodes will deposit their collected social data and standardized analytic results into the IPFS network, and receive a hash for each uploaded file. DApps will later use these hashes to locate & access the data & analytics.
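The key idea behind these hashes is content addressing: the address of a file is derived from its bytes. Here's a small Python sketch of that idea using an in-memory store. Note the assumptions: `ContentStore` is a hypothetical stand-in and not a real IPFS client, and a plain SHA-256 hex digest is used in place of a real IPFS content identifier (real CIDs are multihash-encoded and computed over chunked data).

```python
import hashlib


class ContentStore:
    """In-memory stand-in for a content-addressed store (NOT a real IPFS client)."""

    def __init__(self):
        self._blobs = {}  # address (hex digest) -> stored bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, so identical
        # data always yields the same address.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read, so a consumer can verify that what it received
        # is exactly what the address points to.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data
```

In this sketch, a node would call `put` with its serialized data and hand the returned address to whoever needs the data; the consumer then calls `get` and gets integrity checking for free.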
3. Echo Smart Contract
The entire Echo network communicates & collaborates through the Echo Smart Contract that sits on the Taraxa Layer-1 network. The Echo Smart Contract performs several critical functions:
- Coordinates Echo Nodes: at random intervals, the Echo Smart Contract randomly assigns & shuffles which social groups / accounts each Echo Node is supposed to listen in on, and ensures there's sufficient randomized redundancy (e.g., each Telegram group is listened in on by at least 5 Echo Nodes) so that there's a way to verify the output.
- Validates & Pays for Social Data: the Echo Smart Contract receives hashes (note: these are hashes of the data itself, not the aforementioned IPFS hashes) intermittently submitted by Echo Nodes, proving that they've collected data from their assignments. The randomized redundancy provides a basis for checking whether different nodes submitted the same hashes for the same data set. Nodes whose hashes fall in the majority are rewarded - e.g., if 4 out of 5 nodes submitted identical hashes, those 4 are rewarded and the remaining 1 is not.
- Processes data requests from DApps: external apps (e.g., Hype, Trend Spotter) will need to request data from the Echo network. They will submit their request to the Echo Smart Contract, which will route it to the appropriate nodes using a deterministic mapping algorithm, so the contract doesn't need to maintain a list of node<>data mappings. Nodes could then submit encrypted (e.g., via hybrid cryptography) IPFS hashes to the requester, after which payment is released to the submitter.
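The three functions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual contract logic: the function names, the `epoch_seed` parameter, the `message_id` key, the strict-majority rule, and the use of hash-based ranking (rendezvous hashing) to combine "randomized" with "deterministic" are all our own hypothetical choices, and a real contract would run on-chain rather than in Python.

```python
import hashlib
import json
from collections import Counter


def nodes_for_group(group, nodes, epoch_seed, redundancy=5):
    """Rank nodes by hash(seed, group, node) and keep the first `redundancy`.

    A new seed reshuffles the assignment, yet anyone who knows the seed can
    recompute it, so no node<>group table has to be stored.
    """
    if redundancy > len(nodes):
        raise ValueError("not enough nodes for the requested redundancy")

    def weight(node):
        return hashlib.sha256(f"{epoch_seed}|{group}|{node}".encode()).digest()

    return sorted(nodes, key=weight)[:redundancy]


def dataset_hash(messages):
    """Hash a batch of message dicts in canonical order so honest nodes agree."""
    ordered = sorted(messages, key=lambda m: m["message_id"])  # key name assumed
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def majority_rewarded(submissions):
    """Given {node_id: data_hash} for one data set, return the nodes to reward."""
    if not submissions:
        return set()
    top_hash, votes = Counter(submissions.values()).most_common(1)[0]
    if votes * 2 <= len(submissions):
        return set()  # no strict majority; how ties are handled is an assumption
    return {node for node, h in submissions.items() if h == top_hash}
```

For instance, with submissions `{"n1": "aa", "n2": "aa", "n3": "aa", "n4": "aa", "n5": "bb"}`, `majority_rewarded` returns the four nodes that agreed on `"aa"`, matching the 4-out-of-5 example above. The same `nodes_for_group` call could serve both coordination and request routing, since the requester's group and the current seed are enough to find the responsible nodes.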
Current Focus: Echo Node
As of this writing, our current focus is refining the Echo Node and making sure it can stably gather social data from various social media platforms, Telegram being the first network we're focused upon.
Once the Echo Nodes are able to reliably collect data, we'll turn to decentralized & randomized orchestration. After that, we'll work out the decentralized economics.
We'll end this brief intro with a parting thought: the economics of the Echo network need to be carefully designed to ensure that node runners have sufficient financial incentives to keep the network alive. Whenever data is being bought and sold, there's the risk that the first buyer will resell the data while undercutting the original seller, since the marginal cost is almost zero. In our case, resale value is significantly lower because the use cases rely on time-sensitive data, and because the trustworthiness of the network is critical. We're optimistic that this won't be a significant impediment, as other networks (e.g., ChainLink) face similar issues but seem to be doing just fine. 😄
Stay Tuned!