In-depth Research Report (Part II): Analysis of the integration status, competitive landscape and future opportunities of AI and Web3 data industry

The emergence of GPT has drawn the world’s attention to large language models, and industries of all kinds are trying to use this cutting-edge technology to improve work efficiency and accelerate their development. Future3 Campus and Footprint Analytics jointly conducted an in-depth study of the possibilities of combining AI and Web3, and released a research report entitled “Analysis of the Integration Status, Competitive Landscape and Future Opportunities of AI and Web3 Data Industry”. The report is divided into two parts; this article is the second, edited by Future3 Campus researchers Sherry and Humphrey.

Summary:

  • The combination of AI and Web3 data is driving data processing efficiency and user experience. At present, the exploration of LLM in the blockchain data industry mainly focuses on improving data processing efficiency through AI technology, building AI agents by using the interactive advantages of LLMs, and using AI for pricing and trading strategy analysis.
  • At present, the application of AI in the Web3 data field still faces challenges such as accuracy, explainability, and commercialization. AI is still a long way from fully replacing human intervention.
  • The core competitiveness of Web3 data companies lies not only in AI technology itself, but also in data accumulation capabilities and in-depth analysis and application capabilities of data.
  • AI may not be the solution to the problem of commercialization of data products in the short term, and commercialization will require more productization efforts.

The current status and development paths of combining the Web3 data industry with AI

1.1 Dune

Dune is currently the leading open data analytics community in the Web3 industry, providing tools for querying, extracting, and visualizing large amounts of blockchain data. Users and data analysts can query on-chain data from Dune’s pre-populated database with simple SQL and turn the results into charts and insights.

In March 2023, Dune presented its plans for AI and for incorporating LLMs, and in October it released its Dune AI product. The core focus of Dune’s AI work is to augment the Wizard UX with the linguistic and analytical capabilities of LLMs, better supporting data queries and SQL writing on Dune.

(1) Query interpretation: The product released in March allows users to obtain natural language explanations of SQL queries by clicking a button, which is designed to help users better understand complex SQL queries, thereby improving the efficiency and accuracy of data analysis.

(2) Query translation: Dune plans to migrate different SQL query engines (such as Postgres and Spark SQL) on Dune to DuneSQL, so LLMs can provide automated query language translation capabilities to help users make a better transition and facilitate the implementation of DuneSQL products.

(3) Natural language query: Dune AI, released in October, allows users to ask questions and get data in plain English. The goal of this feature is to let users access and analyze data without needing any SQL knowledge.
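
As an illustration, a natural-language query feature of this kind typically wraps the user’s question and a table schema into an LLM prompt and asks for SQL back. The sketch below shows that pattern; the schema, prompt wording, and function name are assumptions for illustration, since Dune’s actual implementation is not public.

```python
# Hypothetical sketch of turning a plain-English question into an LLM
# prompt that requests a DuneSQL query. Not Dune's actual code.

def build_sql_prompt(question: str, schema: str) -> str:
    """Assemble an LLM prompt that asks the model for a DuneSQL query."""
    return (
        "You are a blockchain data analyst. Given this table schema:\n"
        f"{schema}\n"
        "Write a single DuneSQL query that answers the question below. "
        "Return only SQL.\n"
        f"Question: {question}"
    )

# Illustrative schema; real Dune tables differ.
schema = "ethereum.transactions(hash, block_time, from_address, to_address, value)"
prompt = build_sql_prompt("What was the daily transaction count last week?", schema)
```

The prompt string would then be sent to an LLM API, and the returned SQL executed against the database.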

(4) Search optimization: Dune plans to use LLMs to improve search capabilities and help users filter information more effectively.

(5) Wizard knowledge base: Dune plans to release a chatbot to help users quickly navigate blockchain and SQL knowledge in Spellbook and Dune documentation.

(6) Simplified SQL writing (Dune Wand): Dune launched the Wand series of SQL tools in August. Create Wand generates complete queries from natural language prompts, Edit Wand modifies existing queries, and the Debug feature automatically fixes syntax errors in queries. At the heart of these tools is LLM technology, which simplifies the query writing process and lets analysts focus on the core logic of their analysis without worrying about code and syntax.
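
A Debug feature of this kind can be pictured as a retry loop: run the query, and if the engine reports a syntax error, hand the query and the error back to the model for a corrected attempt. The sketch below uses stub functions in place of the real query engine and LLM call, since Dune Wand’s internals are not public.

```python
# Toy sketch of an LLM-assisted query-debug loop. run_query and llm_fix
# are stubs standing in for a real SQL engine and a real LLM API call.

def run_query(sql):
    """Stub query engine: rejects a known-bad query, accepts the fix."""
    if sql.endswith(","):
        raise SyntaxError("trailing comma before FROM")
    return [{"rows": 42}]

def llm_fix(sql, error):
    """Stub for an LLM repair call: strips the offending trailing comma."""
    return sql.rstrip(",")

def debug_loop(sql, max_attempts=3):
    """Retry the query, asking the 'model' to repair it on syntax errors."""
    for _ in range(max_attempts):
        try:
            return run_query(sql), sql
        except SyntaxError as err:
            sql = llm_fix(sql, str(err))
    raise RuntimeError("could not repair query")

result, fixed_sql = debug_loop("SELECT a, b,")
```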

1.2 Footprint Analytics

Footprint Analytics is a blockchain data solution provider that provides a no-code data analytics platform, a unified data API product, and Footprint Growth Analytics, a BI platform for Web3 projects, with the help of artificial intelligence technology.

Footprint’s advantage lies in its on-chain data production line and ecosystem tools, and in its unified data lake, which connects the metadata of on-chain and off-chain data (effectively a business registry for on-chain entities) to ensure the accessibility, ease of use, and quality of the data users analyze. Footprint’s long-term strategy focuses on technical depth and platform building, aiming to create a “machine factory” capable of producing on-chain data and applications.

Footprint products are combined with AI as follows:

Since the launch of LLMs, Footprint has been exploring how to combine its existing data products with AI to improve the efficiency of data processing and analysis and to create more user-friendly products. In May 2023, Footprint began providing natural-language data analysis to users, upgrading its original no-code capability into a premium feature that lets users quickly obtain data and generate charts through conversation, without being familiar with the platform’s tables or design.

In addition, the LLM + Web3 data products currently on the market mainly focus on lowering the threshold for users and changing the interaction paradigm. Footprint’s focus in combining its products with AI is not only to improve data analysis and user experience, but also to accumulate vertical data and business understanding in the crypto field, and to train crypto-domain language models that improve the efficiency and accuracy of vertical applications. Footprint’s strengths here are reflected in the following areas:

  • Quantity and quality of data knowledge (the knowledge base): the efficiency, sources, quantity, and categories of accumulated data. In particular, the Footprint MetaMosaic sub-product embodies accumulated relationship graphs and static data for specific business logic.
  • Knowledge architecture: Footprint has indexed more than 30 public chains and abstracted structured data tables by business segment. Knowledge of the production process from raw data to structured data can in turn strengthen the understanding of raw data and support better model training.
  • Data type: training on non-standard, unstructured raw on-chain data differs greatly in efficiency and machine cost from training on structured, business-meaningful data tables and metrics. A typical example is feeding more data to an LLM, which requires readable, structured data in addition to specialized crypto-domain data, plus a larger user base to provide feedback data.
  • Crypto money-flow data: Footprint abstracts the capital-flow data most relevant to investment, including time, counterparties (and the direction of flow), token type, amount (priced at the associated point in time), business type, and tags for tokens and entities. This can serve as a knowledge base and data source for LLMs to analyze the main funds behind a token, locate holder distribution, monitor fund flows, identify on-chain changes, track smart money, and so on.
  • Injection of private data: Footprint divides models into three layers: a base model with world knowledge (OpenAI and other models), vertical models for specific domains, and personalized expert-knowledge models. Users can bring knowledge bases from different sources together on Footprint for unified management, and use private data to train private LLMs suited to more personalized application scenarios.
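
To make the money-flow idea concrete, the sketch below models one flow record with the fields described above (time, counterparties, token, amount priced at the time, business type, tags) and computes an entity’s net inflow. Field names and values are illustrative assumptions, not Footprint’s actual schema.

```python
# Illustrative money-flow record and a simple net-inflow aggregation.
# Field names are assumptions for illustration only.
from dataclasses import dataclass, field

@dataclass
class FlowRecord:
    timestamp: str
    from_entity: str
    to_entity: str
    token: str
    amount_usd: float      # token amount priced at the associated time
    business_type: str     # e.g. "dex_swap", "bridge", "transfer"
    tags: list = field(default_factory=list)

def net_flow(records, entity):
    """Net USD inflow for one entity across a list of flow records."""
    inflow = sum(r.amount_usd for r in records if r.to_entity == entity)
    outflow = sum(r.amount_usd for r in records if r.from_entity == entity)
    return inflow - outflow

flows = [
    FlowRecord("2023-10-01T00:00Z", "whale_1", "cex_A", "ETH",
               1_000_000.0, "transfer", ["smart_money"]),
    FlowRecord("2023-10-01T01:00Z", "cex_A", "whale_1", "USDC",
               400_000.0, "transfer", []),
]
```

Structured records like these are far cheaper for a model to consume than raw transaction logs, which is the efficiency gap the bullet on data types describes.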

In its exploration of LLMs, Footprint has also run into a series of challenges, most typically token limits, time-consuming prompts, and unstable answers. A bigger challenge in Footprint’s vertical field of on-chain data is that on-chain entities are numerous, varied, and fast-changing, and the best form in which to feed them to LLMs still requires research and exploration by the whole industry. The current tool chain is still early, and more tools are needed to solve specific problems.

The future of Footprint’s integration with AI in technology and products includes the following:

(1) In terms of technology, Footprint will explore and optimize its combination with LLMs in three aspects:

  • Support LLM inference on structured data, so that the large amount of structured data and knowledge in the crypto field can be applied to the data consumption and production of LLMs.
  • Help users build personalized knowledge bases (including knowledge, data, and experience) and use private data to improve optimized crypto LLMs, so that everyone can build their own model.
  • With AI-assisted analysis and content production, let users create their own GPT through dialogue, combining money-flow data and private knowledge bases to produce and share crypto investment content.

(2) In terms of products, Footprint will focus on AI product applications and business model innovation. According to its recent product roadmap, it will launch an AI crypto content generation and sharing platform for users.

In addition, for the expansion of future partners, Footprint will explore the following two aspects:

First, strengthen cooperation with KOLs to help produce valuable content, run communities, and monetize knowledge.

Second, expand cooperation with more project parties and data providers, creating open, mutually beneficial user incentives and data cooperation, and building a win-win one-stop data service platform.

1.3 GoPlus Security

GoPlus Security is currently the leading user-security infrastructure in the Web3 industry, providing a variety of user-facing security services. It is already integrated with mainstream digital wallets, market websites, DEXs, and many other Web3 applications, so users can directly access security features such as asset security detection, transfer authorization checks, and anti-phishing protection. GoPlus provides user security solutions covering the entire user security lifecycle, protecting user assets from all kinds of attackers.

The development and planning of GoPlus and AI are as follows:

GoPlus’s main exploration in AI technology is reflected in its two products: AI Automated Detection and AI Security Assistant:

(1) AI automatic detection

Since 2022, GoPlus has developed its own AI-based automated detection engine to comprehensively improve the efficiency and accuracy of security detection. The engine uses a multi-layered, funnel-style approach combining static code detection, dynamic detection, and feature or behavior detection. This composite process lets the engine identify and analyze the characteristics of potentially risky samples and effectively model attack types and behaviors. These models are key to identifying and preventing security threats, helping the engine determine whether a risk sample carries a specific attack signature. After long iteration and optimization, the engine has accumulated a wealth of security data and experience, and its architecture can respond quickly and effectively to emerging security threats, ensuring that complex and novel attacks are detected and blocked in time and that users are protected comprehensively. The engine currently applies AI algorithms in multiple security scenarios, such as risky contract detection, phishing website detection, malicious address detection, and risky transaction detection. This reduces the complexity and time cost of manual involvement and improves the accuracy of risk judgments; in particular, for new scenarios that are hard to define manually or hard for rule-based engines to identify, AI can better aggregate features and form more effective analysis methods.
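
The “funnel” idea can be sketched as a pipeline in which candidate samples pass through successively more expensive checks, and only survivors of one layer reach the next. The checks below are placeholders standing in for GoPlus’s actual detection logic, which is not public.

```python
# Toy sketch of a funnel-style, multi-layer detection pipeline.
# Each check is a placeholder, not real detection logic.

def static_check(sample):
    # cheap layer: e.g. a known-bad bytecode pattern
    return "selfdestruct" in sample.get("code", "")

def dynamic_check(sample):
    # middle layer: e.g. simulated execution flagged a suspicious state change
    return sample.get("sim_flagged", False)

def behavior_model(sample):
    # expensive layer: e.g. an ML score over behavioral features
    return sample.get("risk_score", 0.0) > 0.8

def funnel_detect(samples):
    """Only samples surviving every layer are reported as risky."""
    suspects = [s for s in samples if static_check(s)]
    suspects = [s for s in suspects if dynamic_check(s)]
    return [s for s in suspects if behavior_model(s)]

samples = [
    {"id": 1, "code": "selfdestruct", "sim_flagged": True,  "risk_score": 0.9},
    {"id": 2, "code": "transfer",     "sim_flagged": True,  "risk_score": 0.95},
    {"id": 3, "code": "selfdestruct", "sim_flagged": False, "risk_score": 0.9},
]
risky = funnel_detect(samples)
```

The point of the funnel shape is cost control: cheap static checks discard most samples before the expensive behavioral model ever runs.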

In 2023, as large models evolved, GoPlus quickly adapted and adopted LLMs. Compared with traditional AI algorithms, LLMs are significantly more efficient and effective in data identification, processing, and analysis. In the direction of dynamic fuzz testing, GoPlus uses LLM technology to effectively generate transaction sequences and explore deeper states to discover contract risks.

(2) AI security assistant

GoPlus is also developing an AI security assistant that leverages LLM-based natural language processing to provide instant security consulting and improve user experience. Built on the GPT large model and fed with front-end business data, the assistant powers a set of self-developed user security agents that can automatically analyze problems, generate solutions, break down tasks, and execute them, providing users with the security services they need. The AI assistant simplifies communication around security issues and lowers the barrier to understanding.

In terms of product functions, given AI’s importance to security, AI has the potential to completely change the architecture of existing security and antivirus engines, and a new engine architecture with AI at its core may emerge. GoPlus will continue to train and optimize its AI models to move AI from an assistive tool to the core of its security detection engine.

In terms of business model, although GoPlus’s services currently mainly target developers and project parties, the company is exploring more products and services aimed directly at C-end users, as well as new AI-related revenue models. Providing efficient, accurate, and low-cost C-end services will be GoPlus’s core competitiveness in the future, which will require continued research, training, and output on large AI models that interact with users. At the same time, GoPlus will collaborate with other teams, sharing its security data and driving AI applications in the security space, to prepare for possible future industry changes.

1.4 Trusta Labs

Founded in 2022, Trusta Labs is an AI-powered data startup in the Web3 space. Trusta Labs focuses on the efficient processing and accurate analysis of blockchain data using advanced artificial intelligence technology to build the on-chain reputation and security infrastructure of the blockchain. Currently, Trusta Labs’ business consists of two main products: TrustScan and TrustGo.

(1) TrustScan, a product for B-end customers, mainly helps Web3 projects analyze on-chain user behavior and refine user segmentation across acquisition, activity, and retention, so as to identify high-value, real users.

(2) TrustGo, a product for C-end customers, provides a MEDIA analysis tool that evaluates on-chain addresses along five dimensions (fund amount, activity, diversity, identity rights, and loyalty). The product emphasizes in-depth analysis of on-chain data to improve the quality and security of trading decisions.
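
A five-dimension address score of this kind can be pictured as a simple aggregation over per-dimension sub-scores. The dimension names, 0-100 scale, and equal weighting below are illustrative assumptions, not TrustGo’s actual MEDIA formula.

```python
# Hypothetical sketch of a five-dimension address score in the spirit of
# the MEDIA framework. Weights and scale are assumptions for illustration.

DIMENSIONS = ("monetary", "engagement", "diversity", "identity", "loyalty")

def media_score(address_features: dict) -> float:
    """Equal-weight average of five 0-100 dimension sub-scores."""
    return sum(address_features[d] for d in DIMENSIONS) / len(DIMENSIONS)

addr = {"monetary": 60, "engagement": 80, "diversity": 40,
        "identity": 20, "loyalty": 50}
score = media_score(addr)
```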

The development and planning of Trusta Labs and AI are as follows:

At present, both Trusta Labs products use AI models to process and analyze the interaction data of on-chain addresses. Address interaction behavior on a blockchain is sequence data, which is well suited to training AI models. In cleaning, organizing, and labeling on-chain data, Trusta Labs hands much of the work to AI, which greatly improves the quality and efficiency of data processing while also cutting labor costs. By mining on-chain address interaction data with AI, Trusta Labs can effectively identify likely Sybil addresses for B-end customers, and its products have helped a number of projects prevent potential Sybil attacks. For C-end customers, TrustGo leverages the same AI models to help users gain insight into their own on-chain behavior data.
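
One simple Sybil signal on sequence data is behavioral near-duplication: many addresses executing the same action sequence is suspicious. The sketch below groups addresses by identical action sequences and flags large clusters; it is an illustrative heuristic, not Trusta Labs’ actual model.

```python
# Toy Sybil heuristic: addresses with identical on-chain action sequences
# are clustered, and clusters above a size threshold are flagged.
from collections import defaultdict

def flag_sybil_clusters(address_sequences: dict, min_cluster: int = 3):
    """Group addresses by identical action sequence; flag big clusters."""
    clusters = defaultdict(list)
    for addr, seq in address_sequences.items():
        clusters[tuple(seq)].append(addr)
    return [addrs for addrs in clusters.values() if len(addrs) >= min_cluster]

seqs = {
    "0xa1": ["bridge", "swap", "mint"],
    "0xa2": ["bridge", "swap", "mint"],
    "0xa3": ["bridge", "swap", "mint"],
    "0xb1": ["swap", "stake"],
}
suspicious = flag_sybil_clusters(seqs)
```

Real systems would compare sequence similarity rather than exact equality, and combine this with funding-source and timing features.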

Trusta Labs has been closely following the technical progress and practical applications of LLMs. As the cost of model training and inference continues to fall, and as a large corpus of Web3 text and user behavior data accumulates, Trusta Labs will look for the right moment to introduce LLM technology and use it to provide deeper data mining and analysis for its products and users. On top of the rich data Trusta Labs already provides, the hope is that AI can offer more reasonable and objective interpretation of results, such as qualitative and quantitative explanations of captured Sybil accounts for B-end users, so that users better understand the reasons behind the data, and so that B-end users have more detailed supporting material when responding to complaints from their own customers.

On the other hand, Trusta Labs also plans to use open-source or mature LLMs, combined with intent-centric design concepts, to build AI agents that help users solve on-chain interaction problems more quickly and efficiently. In concrete scenarios, users will be able to communicate with an LLM-trained AI agent in natural language; the agent will intelligently return relevant on-chain data and suggest and plan follow-up operations, realizing one-stop intelligent operation centered on user intent, greatly lowering the threshold for using data, and simplifying the execution of on-chain operations.

In addition, Trusta believes that as more AI-based data products emerge, the core competitive factor will not be which LLM a product uses, but rather a deeper understanding and interpretation of the data it has already mastered. Based on that analysis, combined with LLMs, “smarter” AI models can be trained.

1.5 0xScope

0xScope, founded in 2022, is a data-centric innovation platform focused on combining blockchain technology and artificial intelligence, aiming to change the way people process, use, and view data. 0xScope currently offers products for both B-side and C-side customers: the 0xScope SaaS product and 0xScopescan.

(1) The 0xScope SaaS product, a SaaS solution that empowers enterprise customers to conduct post-investment management, make better investment decisions, understand user behavior, and closely monitor competitive dynamics.

(2) 0xScopescan, a B2C product that allows cryptocurrency traders to investigate the flow and activity of funds on selected blockchains.

0xScope’s business focus is to abstract a common data model from on-chain data, simplify on-chain data analysis, and transform raw on-chain data into understandable operational data, helping users conduct in-depth analysis. The data tool platform 0xScope provides not only improves the quality of on-chain data and mines its hidden information, revealing more to users, but also greatly lowers the threshold for data mining.

The development and planning of 0xScope and AI are as follows:

0xScope is upgrading its products with large models in two directions: first, further lowering the threshold for users through natural language interaction; second, using AI models to improve efficiency in data cleaning, analysis, and modeling. 0xScope will also soon launch an AI chat module that greatly reduces the threshold for querying and analyzing data, letting users interact with the underlying data purely through natural language.

However, in training and using AI, 0xScope has found that it still faces the following challenges. First, the cost and time of AI training are high, and responses to a question can take a long time; this forces the team to streamline business processes and focus on vertical Q&A rather than building an all-purpose super AI assistant. Second, the output of LLMs is uncontrollable: data products must give accurate results, but current LLM outputs may differ from reality, which is fatal to the experience of a data product. In addition, the output of a large model may involve users’ private data. Therefore, when using LLMs in its products, the team needs to constrain them heavily so that the AI model’s output is controlled and accurate.
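
One common way to constrain model output in a data product is to validate it before use. The sketch below accepts an LLM-generated SQL string only if it is a read-only query over whitelisted tables; the table names and rules are illustrative assumptions, not 0xScope’s actual guardrails.

```python
# Minimal sketch of a guardrail that validates LLM-generated SQL before
# execution. Whitelist and rules are assumptions for illustration.
import re

ALLOWED_TABLES = {"transactions", "token_transfers"}
FORBIDDEN = ("insert", "update", "delete", "drop", ";--")

def is_safe_query(sql: str) -> bool:
    """Accept only read-only SELECTs over whitelisted tables."""
    lowered = sql.lower()
    if not lowered.lstrip().startswith("select"):
        return False
    if any(word in lowered for word in FORBIDDEN):
        return False
    tables = set(re.findall(r"from\s+(\w+)", lowered))
    return bool(tables) and tables <= ALLOWED_TABLES

ok = is_safe_query("SELECT count(*) FROM transactions WHERE value > 0")
bad = is_safe_query("DROP TABLE transactions")
```

Keeping the model inside a validated, whitelisted surface is one practical answer to the uncontrollable-output problem described above.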

In the future, 0xScope plans to use AI to focus on, and go deep in, specific vertical tracks. Based on its large accumulation of on-chain data, 0xScope can already define the identity of on-chain users; it will continue to use AI tools to abstract on-chain user behavior and build a unique data modeling system that reveals the hidden information in on-chain data.

In terms of cooperation, 0xScope will focus on two groups: first, those the product can serve directly, such as developers, project parties, VCs, and exchanges, who need the data the current product provides; second, partners with AI chat needs, such as Debank and Chainbase, who only need the relevant knowledge and data to call AI Chat directly.

VC insights: the commercialization and future development of AI+Web3 data companies

Through interviews with four senior VC investors, this section examines the current state and development of the AI+Web3 data industry, the core competitiveness of Web3 data companies, and future commercialization paths from the perspective of investment and the market.

2.1 Current situation and development of AI+Web3 data industry

At present, the combination of AI and Web3 data is in a stage of active exploration. Judging by the direction of the leading Web3 data companies, combining AI and LLM technology is an unavoidable trend. At the same time, LLMs have technical limitations of their own and cannot solve many of the data industry’s current problems.

Therefore, we need to recognize that projects should not blindly bolt on AI to boost their appeal, or use AI concepts for hype, but should explore applications that are truly practical and promising. From the VC perspective, the combination of AI and Web3 data is being explored in the following areas:

(1) Improving the capabilities of Web3 data products through AI, both by helping companies improve the efficiency of internal data processing and analysis and by improving products’ automated analysis and retrieval for users. For example, Yuxing of SevenX Ventures noted that AI’s main contribution to Web3 data is efficiency, such as Dune’s use of LLMs for code anomaly detection and for converting natural language into SQL for information indexing; models can also pre-label data, saving substantial labor costs. Nonetheless, the VCs agree that AI plays an auxiliary role in improving the capability and efficiency of Web3 data products; pre-annotated data, for instance, may ultimately require human review to ensure accuracy.

(2) Using LLMs’ strengths in adaptability and interaction to build AI agents/bots, for example using large language models to retrieve data across Web3, including on-chain data and off-chain news, for information aggregation and sentiment analysis. Harper of Hashkey Capital believes this type of AI agent leans toward integration, generation, and user interaction, and will be relatively weak in information accuracy and efficiency.

Although there are already many applications in the two areas above, the technology and products are still at an early, exploratory stage and will need continued optimization and improvement.

(3) Using AI for pricing and trading strategy analysis: there are already projects that use AI to estimate NFT prices, such as NFTGo, backed by Qiming Venture Partners, and some professional trading teams use AI for data analysis and trade execution. Ocean Protocol also recently released a price-prediction AI product. Products of this type may seem imaginative, but they still need to be validated in terms of product and user acceptance, and especially accuracy.

On the other hand, many VCs, especially those with Web2 investment experience, pay more attention to the advantages and application scenarios that Web3 and blockchain can bring to AI. Blockchain’s openness, verifiability, and decentralization, the privacy protection cryptography provides, and Web3’s reshaping of production relations may bring some new opportunities to AI:

(1) AI data ownership confirmation and verification. The advent of AI has made content generation prolific and cheap, and Tang Yi of Qiming Venture Partners noted that it is now difficult to determine the quality and creator of content such as digital works. Confirming the ownership of data content requires a completely new system, and blockchain may be able to help. Zixi of Matrix Partners mentioned that some data exchanges wrap data in NFTs for trading, which can address the problem of data rights confirmation.

In addition, Yuxing of SevenX Ventures noted that Web3 data can mitigate AI’s fraud and black-box problems. AI currently has black-box issues in both the model algorithm and the data, which can skew outputs. Web3 data, by contrast, is transparent, open, and verifiable, so the training sources and results of AI models would be clearer, making AI fairer and reducing bias and error. However, the amount of Web3 data today is not enough to power AI training itself, so this will not be realized in the short term; in the meantime, this property can be used to put Web2 data on-chain to guard against AI deepfakes.

(2) Crowdsourced AI data annotation and UGC communities: traditional AI annotation suffers from low efficiency and quality, especially for professional knowledge that may require interdisciplinary expertise, which general-purpose annotation companies cannot cover and which often must be done in-house by professional teams. Introducing crowdsourced annotation through blockchain and Web3 concepts can meaningfully improve this; for example, Questlab, backed by Matrix Partners, uses blockchain technology to provide crowdsourced data annotation services. In addition, in some open-source model communities, blockchain concepts can help solve the model creator economy.

(3) Data privacy deployment: blockchain combined with cryptography can ensure data privacy and decentralization. Zixi of Matrix Partners mentioned an investment in a synthetic data company that generates synthetic data with large models, mainly for software testing, data analysis, and AI model training. Such companies face many privacy deployment issues when processing data, and using the Oasis blockchain can effectively avoid privacy and regulatory problems.

2.2 How AI+Web3 data companies build core competitiveness

For Web3 technology companies, introducing AI can increase a project’s appeal or attention to a certain extent, but at present most AI-related products from Web3 companies are not enough to become a company’s core competitiveness; they mainly provide a more user-friendly experience and improved efficiency. For example, the threshold for building an AI agent is not high: the first mover may gain a market advantage, but it does not create a barrier.

What really generates core competitiveness and barriers in the Web3 data industry is the team’s data capabilities and how it applies AI technology to solve problems in specific analysis scenarios.

First, the team’s data capabilities include its data sources and its ability to analyze data and tune models, which are the foundation for everything else. In interviews, SevenX Ventures, Matrix Partners, and Hashkey Capital all pointed to the quality of data sources as the core competitiveness of AI+Web3 data companies. On top of that, engineers must be able to skillfully fine-tune models and process and parse data based on those sources.

On the other hand, how the team applies AI also matters, and the scenario must be valuable. Harper believes that although Web3 data companies’ AI work generally starts with AI agents, their positioning differs; for example, Space and Time, a Hashkey Capital portfolio company, cooperated with chainML to launch infrastructure for creating AI agents, and the DeFi agents created there are used on Space and Time.

2.3 The future commercialization path of Web3 data companies

Another important topic for Web3 data companies is commercialization. For a long time, the profit model of data analytics companies has been relatively simple: most ToC offerings are free, and revenue comes mainly from ToB, which depends on B-end customers’ willingness to pay. In Web3, enterprises’ willingness to pay is low, and the industry is dominated by startups, so project parties can rarely sustain long-term payments. As a result, Web3 data companies currently find it difficult to commercialize.

On this issue, the VCs generally believe that current AI adoption only addresses internal production processes and does not change the inherent difficulty of monetization. Some new product forms, such as AI bots, have a low barrier to entry; they may somewhat increase users’ willingness to pay in the ToC market, but not strongly. AI may not solve the commercialization problem of data products in the short term, and commercialization will require more productization effort, such as finding better-suited scenarios and innovative business models.

On the future path of combining Web3 and AI, pairing Web3’s economic models with AI data may lead to new business models, mainly in the ToC field. Zixi of Matrix Partners suggested combining AI products with token mechanics to improve a community’s stickiness, daily activity, and engagement, which is feasible and easier to monetize. Tang Yi of Qiming Venture Partners suggested that, conceptually, Web3’s value system is well suited as an account or value-transfer system for bots: a bot could hold its own account, earn money with its intelligence, and pay for the computing power that sustains it. But this remains a vision of the future, and practical application may still be a long way off.

Under the original business model of direct user payment, products need to be compelling enough to strengthen users’ willingness to pay: for example, higher-quality data sources, or data whose benefits outweigh its cost. This depends not only on the application of AI technology but also on the capabilities of the data team itself.
