- Alibaba introduces ZeroSearch for generating AI training materials
- Could reduce costs by as much as 88%
- Requires extra GPUs to function
Alibaba’s Tongyi Lab has unveiled an approach to training AI search models that does not rely on conventional search engines. It could cut training costs by up to 88% compared with commercial search APIs such as those provided by Google.
A recent research paper, entitled “Incentivize the Search Capability of LLMs without Searching,” outlines how Alibaba uses AI-generated documents to simulate the results that a real search engine would return.
The team said that training on these AI-generated documents could also improve training quality. They noted that conventional search engines often return documents of variable quality, which introduces inconsistencies and can compromise the integrity of the training data.
Use of AI-generated documents to enhance training in AI search models
The approach offers substantial cost advantages. Running the 14B version of ZeroSearch costs approximately $70.80 per 64,000 queries, whereas Google’s APIs cost around $586.70 for the same volume. The smaller 7B and 3B versions of ZeroSearch reduce costs further, to $35.40 and $17.70 per 64,000 queries respectively, while maintaining processing speeds comparable to Google’s solutions.
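The arithmetic behind the headline figure can be checked with a short Python sketch. It uses only the per-64,000-query costs quoted above; the function and variable names are illustrative and do not come from Alibaba’s paper.

```python
# Costs in USD per 64,000 queries, as quoted in this article.
GOOGLE_API_COST = 586.70
ZEROSEARCH_COSTS = {"14B": 70.80, "7B": 35.40, "3B": 17.70}


def savings_percent(baseline: float, alternative: float) -> float:
    """Percentage saved by paying `alternative` instead of `baseline`."""
    return (baseline - alternative) / baseline * 100


for size, cost in ZEROSEARCH_COSTS.items():
    saved = savings_percent(GOOGLE_API_COST, cost)
    print(f"ZeroSearch {size}: ${cost:.2f}, about {saved:.1f}% cheaper than the Google API")
```

On these numbers, the 14B model works out to roughly 88% cheaper, which matches the headline figure, and the smaller models save more. The comparison covers only the quoted query costs, not the expense of acquiring or hosting the GPUs.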
Nevertheless, Alibaba has flagged a significant trade-off: the ZeroSearch technique requires access to one, two, or even four A100 GPUs, while Google’s APIs can be used without any GPU hardware. This dependency on GPUs raises valid concerns about sustainability and environmental impact, particularly in terms of energy consumption.
The researchers candidly addressed these challenges, stating, “Our methodology presents inherent difficulties. Utilizing the simulated search LLM mandates reliable GPU infrastructure. While it tends to be more cost-effective than commercial APIs, it entails additional infrastructure costs.”
Despite these hurdles, Alibaba’s alternative points to a promising new direction for AI development that reduces dependence on expensive and restrictive platforms like Google’s Search APIs, potentially opening doors to a more democratic and accessible AI landscape.