A rose by any other name

TruSet is building multi-sided marketplaces for communities to collect, validate, and share critical reference data. Their Token Data Beta launches December 6, 2018.

In a market we interact with others, and must agree on the thing we’re talking about — the stock, the bond, the token — before we can do anything else. Everyone needs shared references, names. Unique identification (naming) is the primary use case for reference data.

The absence of a reliable identifier opens a market up to a host of problems. There are the obvious issues of scams, fake instruments and mistakes. But the larger problem lies in the excess costs and hidden chaos caused by the need for each participant to manage their own identifiers, and their reliance on others to do the same.

Market data services must constantly monitor their sources to make sure they refer to the same instrument. Exchanges and brokers must clearly identify their listings to their clients. Analysts must check every source of disclosure and market information to ensure it’s actually about what they’re analyzing. Accounting and risk systems must keep irrelevant data out of their databases to avoid erroneous calculations. Every failure to maintain clear references by one service pollutes a whole downstream ecosystem that relies on or incorporates its data. The garbage-in-garbage-out (GIGO) problem compounds as it spreads across the marketplace, putting retail investors at unnecessary and unknown risk, and scaring off institutional money.

Where’s the Data Problem in the Token Ecosystem?

When you and I talk about a stock, currency, credit derivative, or bond, we both know what we’re talking about, right? I mean, AAPL is Apple Inc.’s shares, GBP/USD is the pound/dollar currency pair. It gets a little more complex in some markets, but credit market professionals will instantly know what ITRAXX-ASIAXJIGS30V1–5Y is, and two bond traders can quickly confirm that the “Boeing 2.8s of 23” they’re haggling over are 097023BW4. So why shouldn’t the emerging token economy simply adopt the practices of traditional financial markets?

Well, we do appear to have adopted two such practices. We give tokens symbols like BTC, ETH, ZRX or EOS. When pricing or trading, we mimic the foreign exchange (FX) market practice of listing exchangeable pairs by symbol (BTC/USD, TRX/ETH). Simple.

Except for two issues. First, we’ve adopted the outputs of traditional market practices without adopting the underlying practices that make them function. Second, the practices we’re mimicking don’t actually work as well as we think.

To truly solve this problem, we need to address three related issues:

  • Centralized certification
  • Ambiguity
  • Intersubjectivity

The Problem of (Multiple) Centralized Certification

We often think of token symbols as akin to currency or equity symbols. These seem to do a fair job identifying instruments in traditional markets.

However, neither currency nor equity symbols are assigned by the issuers/creators of the instruments, the way token projects assign symbols. Instead, there are central bodies intimately involved in both markets.

In the FX world, fiat currencies are known by their ISO 4217 codes — the familiar USD, ZAR, and BTN. Countries and their central banks don’t pick these codes. They follow a formula set out in the ISO 4217 standard, maintained by SIX Interbank Clearing Ltd. So when a central bank or country issues a new currency (such as the Euro), it must be recognized by ISO and its designated maintenance agency, and meet their criteria, in order to receive a code. Failure to do so does not mean the currency does not exist — the Guernsey Pound (GGP), for instance — but it does mean that integration into the global FX markets is severely limited. This process actually works pretty well, mostly because there are only a few hundred fiat currencies and their issuers are pretty well known.

In equity markets, securities are generally listed on an exchange, which provides the identifying symbol. It’s not that the issuer has no say. When applying to be listed on a US exchange, for example, the national market system (NMS) allows issuers to reserve symbols (tickers), and has a complex procedure in place for allocating them. This is why some companies have memorable tickers that work with their brand, like YUM, BUD, and HOG. But ultimately, it is still an application — it must be approved by the exchange.

Unique Identifiers in the Traditional Equity Market

If exchanges and not issuers provide equity symbols, how does this create a unique identifier for the security?

It doesn’t. Only the combination of identifier (symbol) and identification scheme is really unique.[1] So how are securities identified in the traditional equity market?

Join me down the rabbit hole…

The majority of companies choose to list on one exchange, which explains the misperception that symbols (tickers) are unique identifiers. But there’s no requirement to do so, and companies can and do cross-list. Apple may be AAPL, but HSBC is HSBA (LSE), 5 (HKEX), HSBC.BH (BSX), and HSB (Euronext). To make matters more complicated, companies listed outside the US will often issue an ADR to gain access to US capital markets. This is still equity, but it is a separate security from the ordinary shares. So HSBC equity is also HSBC on the NYSE, but there you’re not buying the same ordinary shares as HSBA.

It is also possible for different exchanges to assign the same symbol to very different companies. If you have an investment thesis about the future of energy, you might want to investigate SOL. Just don’t confuse SOL on the NYSE with SOL on the JSE; it could throw off your game.
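To make the point concrete, here is a minimal Python sketch built from the listings mentioned above. The dictionary layout and the informal security labels are my own illustration, not any exchange’s or vendor’s data model. It shows that a bare symbol can resolve to several securities, while the pair (scheme, symbol) resolves to exactly one.

```python
# Illustrative only: listings keyed by (identification scheme, symbol).
# The labels on the right are informal descriptions, not official identifiers.
LISTINGS = {
    ("LSE", "HSBA"): "HSBC ordinary shares",
    ("HKEX", "5"): "HSBC ordinary shares",
    ("BSX", "HSBC.BH"): "HSBC ordinary shares",
    ("Euronext", "HSB"): "HSBC ordinary shares",
    ("NYSE", "HSBC"): "HSBC ADR",
    ("NYSE", "SOL"): "company listed on the NYSE as SOL",
    ("JSE", "SOL"): "company listed on the JSE as SOL",
}


def lookup(symbol):
    """Return every (scheme, security) pair a bare symbol could refer to."""
    return sorted(
        (scheme, security)
        for (scheme, sym), security in LISTINGS.items()
        if sym == symbol
    )


def symbols_for(security):
    """Return every (scheme, symbol) pair under which a security is listed."""
    return sorted(key for key, sec in LISTINGS.items() if sec == security)


if __name__ == "__main__":
    print(lookup("SOL"))                        # two different companies
    print(symbols_for("HSBC ordinary shares"))  # one security, four symbols
    print(LISTINGS[("NYSE", "HSBC")])           # scheme + symbol is unambiguous
```

One symbol to many securities, one security to many symbols: that is the many-to-many relationship footnote [1] grumbles about.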

This can all get a bit confusing, particularly for international transactions. Cue the standards bodies and SROs to create national and transnational identifiers. The UK has SEDOLs maintained by the LSE. The US and Canada have CUSIPs maintained by S&P-managed CGS. Similar solutions exist for most markets. So now we have a nice, clean national identifier for securities. HSBC’s common stock is SEDOL 0540528, and also WKN (in Germany) 923893. Its ADR is CUSIP 404280406 and WKN 924153.

That solves some problems, but doesn’t really help internationally. So ISO steps up with the ISO 6166 ISIN. Now we’re talking! An international standard for a global identifier, administered by local numbering agencies, that everyone can use to uniquely identify securities. Problem solved! Finally, I can have exactly two identifiers for my two HSBC equity securities — GB0005405286 for the ordinary shares, and US4042804066 for the ADR. Thank you ANNA.
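If you look closely at the two ISINs above, you can see the national identifiers hiding inside them: a two-letter country prefix, the local number padded to nine characters, and a final check digit. Here is a short Python sketch of that structure, assuming the usual ISO 6166 check digit rule (letters mapped to 10 through 35, then a Luhn sum). The function names are mine.

```python
def isin_check_digit(payload):
    """Compute the check digit for the first 11 characters of an ISIN."""
    # Letters become two-digit numbers (A=10 ... Z=35); digits stay as they are.
    digits = [int(d) for ch in payload.upper() for d in str(int(ch, 36))]
    total = 0
    # Luhn: walking from the right, double every other digit, starting with the first.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def build_isin(country, national_id):
    """Wrap a national identifier (e.g. a CUSIP or SEDOL) in an ISIN."""
    payload = country.upper() + national_id.upper().zfill(9)
    return payload + str(isin_check_digit(payload))


def is_valid_isin(isin):
    """Check length, character set, and the trailing check digit."""
    if len(isin) != 12 or not isin.isalnum() or not isin[-1].isdigit():
        return False
    return isin_check_digit(isin[:11]) == int(isin[-1])


if __name__ == "__main__":
    print(build_isin("GB", "0540528"))    # SEDOL from the text
    print(build_isin("US", "404280406"))  # CUSIP from the text
    print(is_valid_isin("GB0005405286"))
```

Feeding in the SEDOL and CUSIP quoted earlier reproduces the two HSBC ISINs, which is a handy sanity check when you are maintaining these mappings by hand.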

And don’t worry — CGS will happily license its file of all the CUSIPs and ISINs it maintains, and for just $477,750 a year you can use it for as many securities and as many business purposes as you like. If you’re thinking this seems to violate the spirit and purpose of having a shared global identifier in the first place, you’re not alone. Yet here we are. Now that’s for US and Canadian securities — you can go elsewhere for the rest.

Of course, if the security happens not to trade internationally (as is the case with many bonds), it may not have an ISIN at all, since the issuer probably doesn’t want the expense of getting one. Oops.

The Proliferation of Golden Copies

Now that’s a mess of identifiers to gather and maintain. Or more precisely, a mapping of references. You can maintain it yourself (people), or you can outsource it to a specialist. Private sector to the rescue! Turns out you can “buy” reference data from a vendor, and the big three [2] probably have coverage for most of the traditional securities you’re looking for. They’ll rent you their data, adding their fee on top of any redistribution costs for CUSIPs and the like. [3]

Of course, you may want or need to add a few smaller vendors for coverage, particularly if you’re involved in niche markets. And what if some of your counterparties are using different vendors, and their data doesn’t agree with yours? Better keep a local golden copy, so you can track and fix any vendor errors and data mismatches. [4]
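What does tracking vendor mismatches look like in practice? Roughly like the sketch below: line up each vendor’s record for the same security and flag every field where they disagree, so a human can decide what goes into the golden copy. The vendor names, field names, and the mismatching ticker value are all hypothetical, made up for illustration.

```python
def reconcile(vendor_records):
    """Compare records for one security across vendors.

    vendor_records: {vendor_name: {field: value}}
    Returns {field: {vendor_name: value}} for every field where vendors disagree.
    A field missing from one vendor's record counts as a disagreement (None).
    """
    fields = set().union(*(rec.keys() for rec in vendor_records.values()))
    conflicts = {}
    for field in sorted(fields):
        values = {vendor: rec.get(field) for vendor, rec in vendor_records.items()}
        if len(set(values.values())) > 1:
            conflicts[field] = values
    return conflicts


if __name__ == "__main__":
    # Hypothetical vendor feeds for the HSBC ordinary shares.
    records = {
        "vendor_a": {"isin": "GB0005405286", "sedol": "0540528", "ticker": "HSBA"},
        "vendor_b": {"isin": "GB0005405286", "sedol": "0540528", "ticker": "HSBA LN"},
    }
    for field, values in reconcile(records).items():
        print(field, values)  # each conflict needs a human decision
```

The code is the easy part. Deciding which vendor is right, for every conflict, every day, is where the people come in.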

Mappings, mappings, and more mappings (and people, people and more people). We’ve just spent a ton of money — why does it feel like we’re still in the same place?

If the equity markets sound like a bit of a mess to you, the situation is much, much worse in the bond market. Add tranches, seasoning, fungibility, and the recycling of identifiers into this picture, layer in a few orders of magnitude more securities to worry about, and the fun really starts. I used to think those equity guys really had their stuff together.

Ambiguous Symbols in the Token Economy

If centralized certification of identity is not a good solution, what about just decentralizing everything? Maybe if we just let everyone choose their own identifiers, this will sort itself out? That’s where the token economy is right now, so let’s check in on how it’s doing.

Token symbols are especially ambiguous, with a single symbol often signifying multiple tokens, and a single token sometimes going by several symbols. The reasons are all perfectly normal parts of how the decentralized economy is developing:

  • Token creators choose their own symbols, inevitably leading to duplicates across both legitimate and scam projects (my favorites are BTM and GOLD).
  • The ICO model of raising capital for new blockchain projects often involves a token swap from an ICO token (typically on the Ethereum chain) to the final project token, usually known by the same symbol (EOS, TRX).
  • Blockchain forks can split tokens into multiple instruments sharing the same original symbol, and the associated community debates can sometimes change an instrument’s commonly used symbol (Bitcoin Cash).
  • Like traditional exchanges, market participants decide on what symbol to use independently, leading to the same token being referred to by multiple symbols (Bitcoin as BTC and XBT).
  • The existence of multiple networks and traditional financial markets can lead to the use of the same symbol to signify not only different tokens, but also a token and a traditional security (ETH, BCH).

With no central authority to confer “official” status on one token’s symbol over another, duplicates will persist and grow as a perfectly normal and legitimate part of the token economy. [5] So is there a way to uniquely identify tokens that does not rely on a central authority?

What Properties Uniquely Define a Token?

We’re dealing with two quite different classes of “thing” when we ask this question.

In some cases like Gnosis (GNO) or Tether (USDT), we’re talking about non-native “secondary” tokens implemented on top of an underlying blockchain or platform (Ethereum, Omni). Each underlying platform necessarily has its own “key” to identify the token, typically an address or account that implements it. For Ethereum, this is the token’s smart contract address — for Omni, the issuer’s bitcoin address. The underlying platform needs this unique identifier to support multiple tokens in code, so it will be there.

That’s great, all we need to know for non-native tokens is:

  • The fact that it’s a non-native token
  • The underlying platform it’s implemented on
  • The unique address of the token on the platform

In other cases, we’re talking about “primary” units of value built into the platform itself, like Bitcoin, Ether, or Zcash. These native tokens are sometimes referred to as coins or cryptocurrencies, and constitute the majority of the most actively traded and highest market cap tokens. The token itself does not have a representation outside of the platform. So Bitcoin is simply the native “coin” of the Bitcoin network, as Ether is for the Ethereum network. To uniquely identify one of these “coins”, we need to know two properties:

  • The fact that it is a native token
  • The underlying platform it is native to

Some native tokens will additionally have a unique identifier on the platform (NEO), but that is unnecessary if we already know it’s native and have identified the platform.
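The two property sets above collapse into one small key. Here is a minimal Python sketch of it. The class name, the rule that native tokens carry no address, and the all-zero placeholder address are my assumptions for illustration; the address is not the real EOS ICO contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenKey:
    """The three identifying properties: platform, native or not, address."""
    platform: str
    native: bool
    address: Optional[str] = None

    def __post_init__(self):
        # Design choice for this sketch: a native token is identified by its
        # platform alone, so any extra on-platform identifier is left out.
        if self.native and self.address is not None:
            raise ValueError("a native token is identified by its platform alone")
        if not self.native and not self.address:
            raise ValueError("a non-native token needs its address on the platform")


# Placeholder only, NOT a real contract address.
PLACEHOLDER_ADDRESS = "0x" + "0" * 40

# Symbols stay around for human convenience, but each maps to a set of keys.
SYMBOLS = {
    "EOS": {
        TokenKey("Ethereum", native=False, address=PLACEHOLDER_ADDRESS),  # ICO token
        TokenKey("EOS", native=True),                                     # final token
    },
    "BTC": {TokenKey("Bitcoin", native=True)},
    "XBT": {TokenKey("Bitcoin", native=True)},
}

if __name__ == "__main__":
    print(len(SYMBOLS["EOS"]))             # one symbol, two distinct tokens
    print(SYMBOLS["BTC"] == SYMBOLS["XBT"])  # two symbols, one token
```

Notice that `platform` is just a string here. Hold that thought.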

So we’re done? We have determined the properties we need to know, and there are only three of them. Now it’s just a case of each market participant (or vendor) creating their lists and assigning internal keys, and we’re good to go: everyone will know what tokens they’re discussing, transacting, pricing, and so on, and we can all move on? We’ll continue to use symbols and names for human convenience, but make sure we have these three properties available to abolish ambiguity. This is about where the token economy is right now, albeit with some participants being better actors than others. And we don’t have to take any of that nasty expensive centralized abuse. Isn’t decentralization grand?

Hold up. Take another look at those two sets of properties, and the lynchpin in both of them. We need to know the underlying platform. We need to know if it’s Omni, Ethereum, EOS or *gasp* Bitcoin Cash. And this, sadly, is not an objective fact.

Chains and Intersubjectivity in the Cryptosphere

Ahh, intersubjectivity, that most gregarious of epistemological concepts. Why does this have anything to do with blockchains and crypto, where code is law and objective machines reach consensus according to protocols of clear, logical rules? Surely part of the point of the token economy is to do away with all this intersubjective nonsense and the messy human conflicts and uncertainty it brings?

Nic Carter wrote an excellent blog post which explores the irreducibly intersubjective nature of blockchain identity, so I’ll pass this question off to him to examine far better than I. Suffice it to say that while a given network’s existence may be a purely objective fact, its identity is not.

We’re watching this in action right now with Bitcoin Cash, as rival chains (which objectively exist) battle it out for the right to claim the name. Whether one of them wins the prize and the other becomes something else (as the original Bitcoin Cash itself did), or one disappears, or we end up with all new identities and no Bitcoin Cash, is a matter of human, not machine consensus. Intersubjective consensus, to be precise.

So there you are. What is to be done? As Nic explains, we could do what humans have usually done in the face of this problem, and try to hand it off to some centralized authority. But I hope I’ve shown why this is a horrible solution. To quote Nic again:

The other approach is to throw caution to the wind and spurn any external marker of identity, relying instead on an intersubjective consensus, such that the system can change over time while remaining faithful to its original goals.

That’s all well and good from a philosophical perspective, but we’re still trying to operate in the world of code, finance, engineers, and financial professionals. Throwing caution to the wind is not the preferred modus operandi. Is there a way to produce an external marker of identity that doesn’t rely on a central authority, yet still bridges intersubjective consensus and the need of the market for an objective identifier?

If we could capture and maintain the shared intersubjective consensus in an objectively existing data record, and then assign to that record an objectively existing key, we could hang all the individual mappings we need off it.

This is basically what reference data vendors try to do in traditional markets, but their records are necessarily subjective rather than intersubjective. What’s needed is a shared data record, shared in the sense that both content maintenance and acceptance are a group effort. Sounds like a job for a smart contract.
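The idea of hanging individual mappings off a shared record can be sketched in a few lines of Python. Everything here is hypothetical: the record key, the field names, and the two participants are stand-ins to show the shape of the idea, not a description of any real system.

```python
# A shared record, agreed on by the group, stored under a key that exists
# independently of any one participant. The key below is a made-up stand-in.
SHARED_RECORDS = {
    "record-001": {"name": "Bitcoin", "native": True, "platform": "Bitcoin"},
}

# Each participant keeps its own local identifiers, mapped to the shared key.
EXCHANGE_A = {"BTC": "record-001"}
EXCHANGE_B = {"XBT": "record-001"}


def same_token(map_a, id_a, map_b, id_b):
    """True if two local identifiers resolve to the same shared record."""
    key_a, key_b = map_a.get(id_a), map_b.get(id_b)
    return key_a is not None and key_a == key_b


def describe(local_map, local_id):
    """Resolve a local identifier to the shared record, or None if unmapped."""
    return SHARED_RECORDS.get(local_map.get(local_id))


if __name__ == "__main__":
    print(same_token(EXCHANGE_A, "BTC", EXCHANGE_B, "XBT"))
    print(describe(EXCHANGE_B, "XBT"))
```

Each participant still maintains a mapping, but only one: from its own identifiers to the shared key, rather than to every other participant’s identifiers.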

The TruSet Solution: Consensus in the Code

At TruSet we’ve built a system for community publication and validation of reference data about tokens. Since the community both provides and validates the data (for a reward), the resulting records represent the intersubjective consensus of that community.

In our token data model’s description section, you’ll find all the fields necessary to uniquely identify a token or cryptocurrency — the properties discussed above. The community can agree on and maintain this set of fields, modifying it as needed. This consensus is memorialized in a smart contract, thus available to all members of the community at all times. And here’s the sweet part. The smart contract (i.e. the record) has an objective address on the Ethereum blockchain. [6] So that smart contract address itself serves as an objective external marker of the intersubjective consensus of the TruSet community.[7]

And TruSet the organization is not in charge of centrally certifying or assigning it: the users and the Ethereum code are. Veni, vidi, vici. Better yet: venimus, vidimus, vicimus.

We came. We saw. We conquered.

If you are interested in becoming part of this experiment and maintaining trusted data for the token ecosystem, check us out at www.truset.com, or better yet, sign up for our Token Data Beta, which launches this week! Stay tuned for a complete platform summary in coming weeks.


[1] In boring old relational database terms, tickers and their ilk have a many-to-many relationship with the instruments they represent. I realize that relational databases are not de rigueur these days, and as someone working on a Web 3.0 project, I have no business using such outmoded concepts, to which I say two things. First, I’m old and grumpy and remember Web 1.0, you whippersnappers. Second, do you know of a widely adopted reference data system that is not implemented in a relational database?

[2] Bloomberg, Thomson Reuters and Interactive Data/ICE

[3] Contact them individually for pricing; it depends. I can feel the shivers running down your spine.

[4] What’s a “golden copy”, you ask? Why, it’s your locally maintained mapping, of course!

[5] Unfortunately, software programs and many software engineers seriously dislike ambiguity. It’s hard to handle and inefficient, leading to unwanted complexity, bugs, and hard-to-detect errors in data. And if there’s one community that hates ambiguity even more than engineers, it’s probably financial professionals. Not only does it cause headaches and errors, it does so with people’s money, which leads to a lot of unhappiness (and lawsuits, fines, etc.).

[6] OK, on the Rinkeby testnet for now. We’re still in Beta, after all.

[7] For those following the epistemology carefully, yes, I recognize that we still need to intersubjectively agree on the identity of the Ethereum (Rinkeby) chain itself, so the address is not a fully objective identifier. In summarizing Nic Carter, I think I used the term “irreducibly intersubjective.” So yes, it’s a philosophical flaw, it’s still turtles all the way down, and it’s a good thing I’m not a grad student anymore. However, for those more interested in practical applications, we’ve shrunk the number of turtles to one, and at least for the moment it’s a pretty easy one to handle.

Disclaimer: The views expressed by the author above do not necessarily represent the views of ConsenSys AG. ConsenSys is a decentralized community with ConsenSys Media being a platform for members to freely express their diverse ideas and perspectives. To learn more about ConsenSys and Ethereum, please visit our website.


Token Identification in the Cryptomarkets was originally published in ConsenSys Media on Medium.