Anthropic Unveils Claude 2: A Next-Level Chatbot Outperforming OpenAI's ChatGPT

Anthropic, a company founded by former OpenAI employees Daniela and Dario Amodei, has unveiled Claude 2, its latest general-purpose large language model. According to Anthropic, Claude 2 represents a significant improvement over existing chatbots like OpenAI’s ChatGPT. In a blog post, the company claims that Claude 2 is easy to converse with, clearly explains its thinking, is less likely to produce harmful outputs, and has a longer memory.

The launch comes four months after Anthropic, which is backed by Google, introduced the initial version of Claude. Claude models are trained with a method Anthropic refers to as Constitutional AI, which emphasizes helpfulness and the ability to politely decline potentially harmful requests. According to Anthropic, Claude models handle a variety of tasks, including open-ended conversation, search queries, writing, editing, text summarization, coding, and advice on a wide range of subjects.

Anthropic has also introduced a new public beta website, claude.ai, which is currently available only in the United States and the United Kingdom. The chatbot has received positive feedback from developers, who claim that it surpasses the capabilities of GPT-4.

Notably, Claude 2 exhibits significant improvements over its predecessor, particularly in coding, mathematics, and reasoning. Anthropic highlighted these enhancements in its blog post, citing Claude 2’s stronger performance on professional and graduate-level entrance exams. Claude 2 scored 76.5 percent on the multiple-choice section of the bar exam, up from 73.0 percent for Claude 1.3. Additionally, when compared with college students applying to graduate school, Claude 2 scored above the 90th percentile on the GRE reading and writing exams and on par with the median applicant in quantitative reasoning.

Enhancements in Coding, Math, and Reasoning Skills

When comparing Claude 2 to Claude 1.3, one key difference is the expanded length of input and output. Anthropic increased Claude’s context window from 9,000 to 100,000 tokens, allowing the model to analyze hundreds of pages of text in a single prompt. Conversations with the chatbot can now last hours or even days, with an input limit of approximately 75,000 words.
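To put the context-window figures in perspective, here is a minimal Python sketch of the arithmetic. The 100,000-token and 75,000-word figures come from the paragraph above; the 500-words-per-page value is an assumption made only for illustration, and real page lengths and tokenization vary with the text.

```python
# Figures quoted in the article.
CONTEXT_TOKENS = 100_000   # Claude 2 context window, in tokens
APPROX_WORDS = 75_000      # approximate input limit, in words

# Assumption (not from the article): a typical page holds about 500 words.
WORDS_PER_PAGE = 500


def words_per_token(words: int, tokens: int) -> float:
    """Average number of words represented by one token."""
    return words / tokens


def estimate_pages(words: int, words_per_page: int = WORDS_PER_PAGE) -> float:
    """Rough page count for a given word count."""
    return words / words_per_page


ratio = words_per_token(APPROX_WORDS, CONTEXT_TOKENS)
pages = estimate_pages(APPROX_WORDS)

print(f"{ratio:.2f} words per token")
print(f"about {pages:.0f} pages at {WORDS_PER_PAGE} words per page")
```

Under these assumptions, the quoted figures imply roughly three words for every four tokens, and a full context window corresponds to a document on the order of a hundred or more pages.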

Anthropic also asserts that Claude 2 boasts improved coding skills. The blog post states that the model achieved a score of 71.2 percent, up from 56.0 percent, on the Codex HumanEval, a Python coding test. On GSM8k, a comprehensive collection of grade-school math problems, Claude 2 achieved a score of 88.0 percent, up from 85.2 percent. Anthropic plans to continue enhancing Claude 2’s capabilities and gradually deploy these improvements over the next few months.
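The benchmark gains are easier to compare when expressed both in percentage points and as a relative change. The short Python sketch below does that arithmetic using only the scores quoted in this article (including the bar exam result mentioned earlier); the `gains` helper is a hypothetical name introduced for illustration.

```python
# (benchmark, Claude 1.3 score, Claude 2 score), in percent, as quoted in the article.
SCORES = [
    ("Bar exam (multiple choice)", 73.0, 76.5),
    ("Codex HumanEval", 56.0, 71.2),
    ("GSM8k", 85.2, 88.0),
]


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent), rounded to one decimal."""
    points = new - old
    relative = points / old * 100
    return round(points, 1), round(relative, 1)


for name, old, new in SCORES:
    points, relative = gains(old, new)
    print(f"{name}: +{points} points ({relative}% relative improvement)")
```

By this measure, the coding benchmark shows by far the largest jump, while the bar exam and GSM8k gains are more modest.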

Addressing a common challenge faced by chatbot technology, Anthropic has focused on enhancing the safety of Claude 2. The company claims that the latest version is “more harmless and less likely to generate offensive or dangerous output” when compared to Claude 1.3. Based on Anthropic’s internal red-teaming evaluation, which involves testing the models against a wide range of harmful prompts, Claude 2 demonstrated a twofold improvement in delivering harmless responses.

With the launch of Claude 2, Anthropic aims to provide a chatbot that excels in various tasks while prioritizing safety and user experience. The company’s commitment to continual improvement suggests that we can anticipate further advancements in the chatbot’s capabilities in the near future.
