Recent advancements in large language models (LLMs) have brought about significant changes in the field, equipping them with new capabilities like natural conversation, mathematical reasoning, and program synthesis.
However, LLMs still have some inherent limitations. Fixed weights restrict their ability to store information, and their computation capabilities are confined to a static graph and limited context.
Moreover, as the world evolves, LLMs require retraining to update their knowledge and reasoning skills. To overcome these limitations, researchers have started enhancing LLMs with tools. Giving LLMs access to search engines, databases, and computational tools provides them with extensive, dynamic knowledge and enables complex computational tasks.
Recently, researchers from UC Berkeley and Microsoft introduced Gorilla, a LLaMA-7B model explicitly designed for API calls. Gorilla relies on self-instruction fine-tuning and retrieval techniques to enable LLMs to accurately select from a large and constantly changing set of tools provided through their APIs and documentation.
The authors create a substantial corpus of APIs, known as APIBench, by gathering machine learning APIs from major model hubs like TorchHub, TensorHub, and HuggingFace. Using self-instruction, they generate pairs of instructions and corresponding APIs.
The fine-tuning process involves converting the data into a user-agent chat-style conversation format and performing standard instruction fine-tuning on the base LLaMA-7B model.
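To make the conversion step concrete, here is a minimal sketch of how an instruction/API pair might be wrapped into a user-agent chat-style example. The field names and template are illustrative assumptions, not Gorilla's exact format.

```python
# Hypothetical sketch: wrapping one self-instruct (instruction, API) pair
# as a single-turn user-agent conversation for instruction fine-tuning.
# Field names and the prompt template are assumptions for illustration.

def to_chat_example(instruction: str, api_call: str, domain: str) -> dict:
    """Convert an instruction/API pair into a chat-style training record."""
    return {
        "conversations": [
            {"role": "user", "content": f"[{domain}] {instruction}"},
            {"role": "assistant", "content": api_call},
        ]
    }

example = to_chat_example(
    instruction="Classify an image into one of 1000 ImageNet classes.",
    api_call="torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
    domain="TorchHub",
)
```

Records in this shape can then be fed to a standard instruction fine-tuning pipeline over the base LLaMA-7B model.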
Image description: Gorilla AI showing the correct API
API calls often come with constraints, which complicate the LLM's task of understanding and composing them.
For instance, a prompt might require the LLM to use an image classification model that meets specific parameter-size and accuracy requirements. These difficulties underscore the need for LLMs to grasp not only the functional description of an API call but also to infer its implicit constraints.
A look at the dataset
The dataset is focused on machine learning and comprises three domains: TorchHub, TensorHub, and HuggingFace. Each domain contributes its own collection of APIs, giving the dataset a diverse character.
An additional effort has been made to enhance the dataset’s value and usefulness. Each API in the dataset is accompanied by a carefully crafted set of 10 unique instructions. These instructions are essential guides for both training and evaluation purposes.
This initiative ensures that every API goes beyond simple representation, allowing for more robust usage and analysis.
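One dataset record might look roughly like the following. The schema shown here is an assumption for illustration; APIBench's actual field names may differ.

```python
# Illustrative shape of one APIBench-style record: an API paired with
# ten hand-crafted instructions used for training and evaluation.
# The field names are assumptions, not the dataset's real schema.

record = {
    "domain": "TorchHub",
    "api_call": "torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
    "api_doc": "Loads a pretrained ResNet-50 image classification model.",
    # Ten unique instructions accompany every API in the dataset.
    "instructions": [f"Example instruction variant {i}" for i in range(10)],
}
```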
Gorilla introduces the concept of retriever-aware training, where the instruction-tuned dataset includes an extra field with retrieved API documentation for reference. This approach aims to teach the LLM to understand and answer questions based on the provided documentation.
The authors demonstrate that this technique enables the LLM to adapt to changes in API documentation, leading to improved performance and reduced errors.
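A retriever-aware training example can be sketched as follows: the retrieved documentation is appended to the instruction, with a message pointing the model at it. The template wording and field names here are assumptions, not Gorilla's exact prompt.

```python
# Sketch of retriever-aware training data construction: each fine-tuning
# example carries an extra field with retrieved API documentation, and the
# prompt tells the model to answer with reference to that documentation.
# Template wording and field names are illustrative assumptions.

def make_retriever_aware_example(
    instruction: str, api_call: str, retrieved_doc: str
) -> dict:
    prompt = (
        f"{instruction}\n"
        f"Use this API documentation for reference: {retrieved_doc}"
    )
    return {"prompt": prompt, "completion": api_call}

ex = make_retriever_aware_example(
    "Load a pretrained image classifier.",
    "torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
    "torch.hub.load(repo, model, pretrained): loads a model from a repo.",
)
```

Because the model learns to ground its answer in the attached documentation rather than in memorized API signatures, updated documentation can steer it toward current APIs at test time.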
During inference, users provide prompts in natural language. Gorilla operates in two modes:
- zero-shot, and
- with retrieval.
In zero-shot mode, the prompt is given directly to the Gorilla model, which suggests the appropriate API call to achieve the task or goal. In retrieval mode, the retriever (either BM25 or GPT-Index) retrieves the most up-to-date API documentation from the API database.
This documentation is combined with the user prompt and a message indicating the reference to the API documentation. The combined input is then fed to Gorilla, which outputs the API to be used. Beyond the concatenation step, no further prompt tuning is performed in this system.
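The retrieval-mode flow can be sketched end to end. Note the retriever below is a toy token-overlap scorer standing in for BM25 or GPT-Index, and all prompts, documents, and names are illustrative assumptions.

```python
# Toy sketch of Gorilla's retrieval mode: a stand-in retriever (simple
# token overlap, in place of BM25 or GPT-Index) picks the most relevant
# API doc, which is concatenated with the user prompt before being passed
# to the model. All documents and template wording are illustrative.

api_docs = [
    "torch.hub.load('pytorch/vision', 'resnet50'): image classification model",
    "torch.hub.load('pytorch/fairseq', 'transformer'): machine translation model",
]

def retrieve(query: str, docs: list) -> str:
    """Return the doc with the highest word overlap with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(user_prompt: str) -> str:
    doc = retrieve(user_prompt, api_docs)
    # Beyond this concatenation, no further prompt tuning is performed.
    return f"{user_prompt}\nUse this API documentation for reference: {doc}"

prompt = build_prompt("I need a model for image classification")
```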
Image description: Gorilla is an LLM that can provide appropriate API calls.
Inductive program synthesis has succeeded in various domains by creating programs satisfying specific test cases. However, when evaluating API calls, relying solely on test cases falls short because verifying if the code is semantically correct becomes challenging.
To evaluate the model’s performance, the authors compare functional equivalence against the collected dataset. To identify the API called by the LLM, they use an AST (abstract syntax tree) sub-tree matching strategy: by checking whether the AST of a candidate API call is a sub-tree of the reference API call, it becomes possible to trace which API is being used.
Defining and identifying hallucinations presents a significant challenge. “The AST matching process helps identify hallucinations directly. In this context, a hallucination refers to an API call that is not a sub-tree of any API in the database, essentially invoking an entirely imagined tool.”
It’s important to note that this definition of a hallucination differs from invoking an API incorrectly, which is defined as an error.
AST sub-tree matching is crucial in identifying the specific API being called within the dataset. Since API calls can have multiple arguments, each of these arguments needs to be matched. Additionally, considering that Python allows for default arguments, defining which statements to check for each API in the database is essential.
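A greatly simplified version of this check can be written with Python's `ast` module. This sketch only compares the function name, positional arguments, and keyword arguments; Gorilla's real evaluator matches full sub-trees and accounts for default arguments, so treat this as an illustration of the idea rather than the paper's implementation.

```python
# Simplified sketch of AST-based API matching, in the spirit of Gorilla's
# evaluation. A candidate "matches" a reference call if the dotted function
# name and positional arguments agree and every keyword argument of the
# reference appears in the candidate. The real evaluator does full
# sub-tree matching; this is an illustration only.
import ast

def call_signature(code: str):
    """Extract (dotted name, positional args, keyword args) from a call."""
    call = ast.parse(code, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("not a function call")
    name = ast.unparse(call.func)
    args = [ast.literal_eval(a) for a in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return name, args, kwargs

def matches(candidate: str, reference: str) -> bool:
    cand_name, cand_args, cand_kw = call_signature(candidate)
    ref_name, ref_args, ref_kw = call_signature(reference)
    return (
        cand_name == ref_name
        and cand_args == ref_args
        and all(cand_kw.get(k) == v for k, v in ref_kw.items())
    )

# A candidate that matches no API call in the database at all would be
# counted as a hallucination under the paper's definition.
ok = matches(
    "torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
    "torch.hub.load('pytorch/vision', 'resnet50', pretrained=True)",
)
```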
Is AGI on the way?
Gorilla hints at capabilities that go beyond what static AI models offer. Does that mean AGI is just around the corner? The questions are many, and so are the possible answers. What do you think?
Let us know in the comments below.