Immutable, verifiable CIDs are easy to cache
When working with off-chain references to NFTs (especially in contexts like gaming), there are three key properties that matter:
In previous blogs, we’ve argued about how to approach problems (1) and (2) - both of which are critical for ensuring is persisted and available with the guarantees users expect. Today, we’ll share an approach for leveraging the verifiability of content IDs (CIDs) used by IPFS to deliver on this third objective.
As a brief refresher - the URIs used by IPFS to reference content are centered around CIDs, which are hashes of the content using the IPLD data model. Hash based addresses mean we can reference, resolve, and verify content - regardless of where that content is actually stored.
When serving content over IPFS, the two primary methods for doing so are via:
Note that because in both cases the reference is to a CID - you don’t have to trust the content served to you is the content you requested (regardless of whether you ask for the data over a peer-to-peer network, or over HTTP), you as the recipient can verify that the content you’ve received back is actually the content you requested by checking to see if the data matches the CID!
CID’s are cryptographically verifiable, which gives them strong performance and safety properties when referenced from smart contracts, apps, and integrity checkers. Hashes of other values, such a transaction or signature, don’t offer the same benefit. By utilizing the verifiability of web3 storage, we can use Web2 infrastructure to access content with low latency.
Web2 has a number of tools and hosting services that exist to make performance fantastic for apps. Service providers like Cloudflare, Fastly, and Vercel include caching layers to improve performance. Static site builders like Gatsby and Astro generate static assets designed to play well with HTTP caching layers.
In most of these caching layers, developers tend to work with hashes of the underlying asset or state being cached. The unique referencing and idempotent operations which hashes enable can dramatically simplify state management (and help avoid nasty bugs!) especially when designing cache invalidation semantics.
Of course, this is a natural complement to the way in which CIDs are used as references inside the IPFS network! Applications can tune IPFS gateways for maximum performance by using the verifiable primitives of CIDs within the traditional infrastructure stack of web2. If you know your popular content, you can prime your gateway’s cache using the appropriate CIDs. Client requests can be served by the winner of a race between your caching layers and the IPFS DHT. From the user’s perspective this is transparent - as the CID allows the data they receive to be verified.
We’ve been putting the above into practice in multiple ways. From a storage standpoint, our APIs are run on Cloudflare Workers, which are edge servers deployed to cover the vast majority of the globe with sub-second latency. And you can take advantage of the performance and reliability this infrastructure provides you completely trustlessly. Because CIDs are computed on the client-side, you don't actually have to trust that the CID we give you back is the correct one. And once an upload is stored on Filecoin storage providers, you can always verify that these storage providers are actually storing your data by checking the Filecoin chain for their storage proofs. In fact, as NFT.Storage decentralizes itself as a service, you will even be able to verify the smart contracts enforcing long-term redundancy, renewal, and persistence.
On the retrieval side, we are soon launching an IPFS Gateway CDN that is optimized for NFT data. When a new NFT is processed by NFT.Storage (either because it was uploaded directly or because it was indexed off a public chain), its content (metadata.json, images, and other assets) is not only archived to Filecoin, it is also primed into the edge cache.
Our aim is for the resulting Gateway URLs to be the fastest way to load NFT assets and metadata. If you are building an NFT gallery or wallet, you can use our API to look up any NFT and retrieve an HTTP URL for it (that includes a CID). This way you can offer your users lightning fast performance with web3 cryptographic integrity, without managing your own crawler, archive, or CDN.
The gateway is open source, so if you have another application that could benefit from similar treatment, go ahead and spin one up. For instance, instead of preloading all the NFTs, yours could specialize in open source geographic or time series datasets, public domain music, or maybe the social media uploaded to your web2 app. In this way we expect our caching gateway to be useful even for legacy applications.