Microsoft Unveils Fine-Tuning for Phi-3

Microsoft has been a key supporter and partner of OpenAI, but it’s clear that the tech giant is not content to let OpenAI dominate the generative AI landscape. In a significant move, Microsoft has introduced a new way to fine-tune its Phi-3 small language model without the need for developers to manage their own servers, and it’s available for free initially.

What is Phi-3?

Phi-3 is a family of small language models that Microsoft launched in April 2024; its smallest member, Phi-3-mini, has 3.8 billion parameters. The family serves as a low-cost, enterprise-grade option for third-party developers looking to build new applications and software. Despite its smaller size compared to other leading language models, Phi-3 performs on par with OpenAI’s GPT-3.5 model. It is designed for coding, common sense reasoning, and general knowledge tasks, making it an affordable and efficient choice for developers.

The Phi-3 Family

The Phi-3 family includes six models with varying parameter counts and context lengths, ranging from 4,000 to 128,000 tokens per input. Costs range from $0.0003 to $0.0005 per 1,000 input tokens, equating to $0.30 to $0.50 per 1 million input tokens. This makes Phi-3 a cost-effective alternative to OpenAI’s GPT-4o mini.
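
To make the arithmetic concrete, here is a minimal Python sketch using only the rates quoted above; the workload figures are made up purely for illustration:

```python
# Convert Phi-3's quoted per-1,000-token rate into a monthly cost estimate.
RATE_PER_1K_INPUT = 0.0003  # USD per 1,000 input tokens (cheapest tier quoted above)

def monthly_cost(tokens_per_request: int, requests_per_month: int, rate_per_1k: float) -> float:
    """Estimated spend for a month of requests at a given per-1,000-token rate."""
    total_tokens = tokens_per_request * requests_per_month
    return total_tokens / 1_000 * rate_per_1k

# Example workload: 2,000-token prompts, 50,000 requests/month -> 100M tokens -> $30.
print(f"${monthly_cost(2_000, 50_000, RATE_PER_1K_INPUT):.2f}")
```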

Serverless Fine-Tuning

Microsoft’s new Models-as-a-Service (serverless endpoint) offering in its Azure AI development platform allows developers to fine-tune Phi-3-small without managing infrastructure. Phi-3-vision, which can handle image inputs, will soon be available via a serverless endpoint as well. Developers who need custom-tuned models can also fine-tune Phi-3-mini and Phi-3-medium with their own third-party data.

Benefits and Use Cases

Phi-3 models are ideal for various scenarios, such as teaching the model new skills or improving the quality of its responses. For instance, Khan Academy uses a fine-tuned Phi-3 model to benchmark improvements to its Khanmigo for Teachers assistant, which is currently powered by Microsoft’s Azure OpenAI Service.

Pricing and Competition

Serverless fine-tuning of Phi-3-mini-4k-instruct starts at $0.004 per 1,000 tokens ($4 per 1 million tokens). This positions Microsoft as a strong competitor to OpenAI, which recently offered free fine-tuning of GPT-4o mini for certain users.

Unveiling the Stack Overflow 2024 Developer Survey

In a revealing snapshot of the global software development ecosystem, the developer knowledge platform Stack Overflow has released a new report that delves into the intricate relationship between artificial intelligence (AI) and the coding community. The Stack Overflow 2024 Developer Survey provides a wealth of insights into how generative AI (gen AI) is reshaping the tech landscape and its impact on developers worldwide.

Key Findings from the 2024 Developer Survey

Stack Overflow’s 2024 Developer Survey is based on responses from more than 65,000 developers across 185 countries. This extensive survey highlights the following key points:

  • AI Tool Usage: AI tool usage among developers increased to 76% in 2024, up from 70% in 2023.
  • AI Favorability: Despite increased usage, AI favorability decreased from 77% to 72%.
  • Trust in AI: Only 43% of respondents trust the accuracy of AI tools.
  • Productivity Boost: 81% of developers cite increased productivity as the top benefit of AI tools.
  • Ethical Concerns: Misinformation emerges as the top AI-related ethical concern (79%).
  • Job Security: 70% of professional developers don’t see AI as a threat to their jobs.

The Role of Gen AI in the Developer Community

Increasing Developer Numbers

Contrary to some fears that gen AI might replace developers, it appears that gen AI is actually increasing the number of developers rather than reducing the need for them. Ryan Polk, Chief Product Officer at Stack Overflow, believes that gen AI will democratize coding and significantly grow the developer community.

Enhancing Developer Productivity

Gen AI coding tools are seen as beneficial to developers in their daily tasks. AI-powered code generators, for instance, can reduce the time spent on boilerplate code, allowing developers to focus on more complex problems. Polk describes this as a “Better Together” approach, where gen AI tools complement resources like Stack Overflow to provide a powerful combination.

Trust and Ethical Concerns

Declining Favorability

One of the declining metrics in the 2024 report is the favorability of gen AI tools. In 2023, 77% of respondents had a favorable view of these tools, which fell to 72% in 2024. Senior analyst Erin Yepis suggests that more developers trying these tools and being disappointed in their experiences might be a contributing factor.

Trust Issues

A significant concern among developers is the lack of trust in gen AI tools, primarily due to AI hallucination issues. The top ethical concerns include AI’s potential to spread misinformation (79%), missing or incorrect attribution for sources of data (65%), and bias that doesn’t represent a diversity of viewpoints (50%).

The Role of Stack Overflow

Stack Overflow and its community play a crucial role in addressing trust issues in gen AI. Polk emphasizes that user trust in data, technology, and community knowledge is vital for AI’s future success. Stack Overflow’s partnerships with AI and cloud companies, such as Google Cloud and OpenAI, aim to set new standards with vetted, trusted, and accurate data.

Conclusion

The 2024 Developer Survey by Stack Overflow reveals a complex yet promising landscape where gen AI and developers coexist and collaborate. While there are challenges related to trust and ethical concerns, the potential for increased productivity and growth in the developer community is significant. As gen AI continues to evolve, the collaboration between AI tools and developer communities like Stack Overflow will be essential in shaping a responsible and innovative future for software development.

ElevenLabs’ New AI Voice Isolator

ElevenLabs, the AI voice startup known for its voice cloning, text-to-speech, and speech-to-speech models, has just launched a new tool: an AI Voice Isolator. Now available on the ElevenLabs platform, this tool allows creators to remove unwanted ambient noise and sounds from any content, including films, podcasts, and YouTube videos.

A New Tool in the Creative Arsenal

The AI Voice Isolator arrives shortly after ElevenLabs released its Reader app. While the tool is free to use with some limitations, it’s worth noting that enhancing speech quality is not a novel capability. Many other providers, including Adobe, offer similar tools. However, the true test will be how well Voice Isolator performs compared to these existing solutions.

How Does the AI Voice Isolator Work?

Creators often face the challenge of background noise when recording content like films, podcasts, or interviews. These noises can interfere with the final output, diminishing the quality of the recorded speech. Traditional solutions, such as using microphones with ambient noise cancellation, can be costly and out of reach for early-stage creators with limited resources.

This is where the AI Voice Isolator steps in. During the post-production stage, users upload the content they want to enhance. The tool then processes the file, detects and removes unwanted noise, and extracts clear dialogue. ElevenLabs claims the product can deliver speech quality comparable to studio recordings. In a demo, the company’s head of design, Ammaar Reshi, showcased how the tool effectively removed the noise of a leaf blower, leaving crystal-clear speech.

Real-World Testing

We conducted three tests to evaluate the Voice Isolator’s real-world applicability. In the first test, we spoke three separate sentences with different background noises. The other two tests involved sentences with a mix of various noises occurring randomly.

In every case, the tool processed the audio within seconds. It successfully removed noises like door openings and closings, table banging, clapping, and household item movements, extracting clear speech without distortion. However, it struggled with wall banging and finger snapping sounds.

Limitations and Future Improvements

Sam Sklar, who handles growth at ElevenLabs, noted that the tool does not currently work on music vocals, although users are encouraged to experiment with it.

While the AI Voice Isolator’s ability to remove irregular background noise is impressive, there is still room for improvement, and as with ElevenLabs’ other tools, ongoing enhancements are expected. Details about the underlying models powering the tool, and whether recordings are used for training, remain undisclosed. Users can opt out of having their data used for training via a form linked in the company’s privacy policy.

Access and Pricing

Currently, the Voice Isolator is available only through the ElevenLabs platform, with plans to open API access in the coming weeks. Free access comes with a usage limit of 10,000 characters per month, which translates to roughly 10 minutes of audio. For larger audio files, paid plans start at $5/month.
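
API access isn’t live yet, but for readers planning ahead, here is a purely hypothetical sketch of what a noise-removal request might look like; the endpoint path and field names are assumptions, not ElevenLabs’ documented API (only the xi-api-key auth header mirrors the company’s existing endpoints):

```python
# Hypothetical sketch only: ElevenLabs has not yet published the Voice Isolator API.
# The endpoint path and form field below are illustrative assumptions.
import requests

API_KEY = "your-elevenlabs-api-key"
ENDPOINT = "https://api.elevenlabs.io/v1/audio-isolation"  # assumed path

with open("noisy_podcast.mp3", "rb") as audio_file:
    response = requests.post(
        ENDPOINT,
        headers={"xi-api-key": API_KEY},  # auth header used by existing ElevenLabs endpoints
        files={"audio": audio_file},      # assumed field name
    )

response.raise_for_status()
with open("clean_podcast.mp3", "wb") as out:
    out.write(response.content)  # assumed: cleaned audio returned in the response body
```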

Nvidia Unveils New RTX Technology to Power AI Assistants and Digital Humans

Nvidia is once again pushing the boundaries of technology with its latest RTX advancements, designed to supercharge AI assistants and digital humans. These innovations are now integrated into the newest GeForce RTX AI laptops, setting a new standard for performance and capability.

Introducing Project G-Assist

At the forefront of Nvidia’s new technology is Project G-Assist, an RTX-powered AI assistant demo that provides context-aware assistance for PC games and applications. This innovative technology was showcased with ARK: Survival Ascended by Studio Wildcard, illustrating its potential to transform gaming and app experiences.

Nvidia NIM and the ACE Digital Human Platform

Nvidia also launched its first PC-based Nvidia NIM (Nvidia Inference Microservices) for the Nvidia ACE digital human platform. These announcements were made during CEO Jensen Huang’s keynote at the Computex trade show in Taiwan. Nvidia NIM enables developers to reduce deployment times from weeks to minutes, supporting natural language understanding, speech synthesis, and facial animation.

The Nvidia RTX AI Toolkit

These advancements are supported by the Nvidia RTX AI Toolkit, a comprehensive suite of tools and SDKs designed to help developers optimize and deploy large generative AI models on Windows PCs. This toolkit is part of Nvidia’s broader initiative to integrate AI across various platforms, from data centers to edge devices and home applications.

New RTX AI Laptops

Nvidia also unveiled new RTX AI laptops from ASUS and MSI, featuring up to GeForce RTX 4070 GPUs and energy-efficient systems-on-a-chip with Windows 11 AI PC capabilities. These laptops promise enhanced performance for both gaming and productivity applications.

Advancing AI-Powered Experiences

According to Jason Paul, Vice President of Consumer AI at Nvidia, the introduction of RTX Tensor Core GPUs and DLSS technology in 2018 marked the beginning of AI PCs. With Project G-Assist and Nvidia ACE, Nvidia is now pushing the boundaries of AI-powered experiences for over 100 million RTX AI PC users.

Project G-Assist in Action

AI assistants like Project G-Assist are set to revolutionize gaming and creative workflows. By leveraging generative AI, Project G-Assist provides real-time, context-aware assistance. For instance, in ARK: Survival Ascended, it can help players by answering questions about creatures, items, lore, objectives, and more. It can also optimize gaming performance by adjusting graphics settings and reducing power consumption while maintaining performance targets.

Nvidia ACE NIM: Powering Digital Humans

The Nvidia ACE technology for digital humans is now available for RTX AI PCs and workstations, significantly reducing deployment times and enhancing capabilities like natural language understanding and facial animation. At Computex, the Covert Protocol tech demo, developed in collaboration with Inworld AI, showcased Nvidia ACE NIM running locally on devices.

Collaboration with Microsoft: Windows Copilot Runtime

Nvidia and Microsoft are working together to enable new generative AI capabilities for Windows apps. This collaboration will allow developers to access GPU-accelerated small language models (SLMs) that enable retrieval-augmented generation (RAG) capabilities. These models can perform tasks such as content summarization, content generation, and task automation, all running efficiently on Nvidia RTX GPUs.
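
To make the RAG pattern concrete, here is a minimal, vendor-neutral sketch of the retrieve-then-generate loop; the embed() and generate() helpers are illustrative stand-ins, not part of Nvidia’s or Microsoft’s actual stack:

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# embed() and generate() are stand-ins for a real embedding model and a local SLM.
import numpy as np

documents = [
    "Nvidia ACE NIM reduces digital-human deployment times from weeks to minutes.",
    "The RTX AI Toolkit bundles SDKs for optimizing models on Windows PCs.",
    "RTX Video adds AI upscaling, sharpening, and HDR conversion.",
]

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: replace with a real embedding model in practice."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents whose embeddings are most similar to the query."""
    q = embed(query)
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def generate(prompt: str) -> str:
    """Stand-in for the GPU-accelerated SLM call a real Windows app would make here."""
    return f"[SLM answer grounded in: {prompt!r}]"

context = "\n".join(retrieve("How fast can I deploy a digital human?"))
print(generate(f"Context:\n{context}\n\nQuestion: How fast can I deploy a digital human?"))
```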

The RTX AI Toolkit: Faster and More Efficient Models

The Nvidia RTX AI Toolkit offers tools and SDKs for customizing, optimizing, and deploying AI models on RTX AI PCs. This includes QLoRA tools for model customization and Nvidia TensorRT for model optimization, resulting in faster performance and reduced RAM usage. The Nvidia AI Inference Manager (AIM) SDK simplifies AI integration for PC applications, supporting various inference backends and processors.
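
Nvidia’s toolkit wraps this workflow in its own tooling, but the underlying QLoRA technique can be sketched with the open-source Hugging Face stack; a minimal sketch, assuming the transformers, peft, and bitsandbytes packages, a CUDA-capable GPU, and Phi-3-mini’s attention projection names as the LoRA targets:

```python
# Generic QLoRA setup: quantize a base model to 4-bit, then attach trainable LoRA adapters.
# Illustrative only; Nvidia's RTX AI Toolkit wraps a similar workflow in its own tools.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # the "Q" in QLoRA: 4-bit base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj"],  # assumed attention projection names for Phi-3
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)  # only the small adapter matrices are trainable
model.print_trainable_parameters()
```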

AI Integration in Creative Applications

Nvidia’s AI acceleration is being integrated into popular creative apps from companies like Adobe, Blackmagic Design, and Topaz. For example, Adobe’s Creative Cloud tools are leveraging Nvidia TensorRT to enhance AI-powered capabilities, delivering unprecedented performance for creators and developers.

RTX Remix: Enhancing Classic Games

Nvidia RTX Remix is a platform for remastering classic DirectX 8 and 9 games with full ray tracing and DLSS 3.5. Since its launch, it has been used by thousands of modders to create stunning game remasters. Nvidia continues to expand RTX Remix’s capabilities, making it open source and integrating it with popular tools like Blender and Hammer.

AI for Video and Content Creation

Nvidia RTX Video, an AI-powered super-resolution feature, is now available as an SDK for developers, allowing them to integrate AI for upscaling, sharpening, and HDR conversion into their applications. This technology will soon be available in video editing software like DaVinci Resolve and Wondershare Filmora, enabling video editors to enhance video quality significantly.

Conclusion

Nvidia’s latest advancements in RTX technology are set to revolutionize AI assistants, digital humans, and content creation. By providing powerful tools and capabilities, Nvidia continues to push the boundaries of what AI can achieve, enhancing user experiences across gaming, creative applications, and beyond.

Stay updated with the latest in AI and RTX technology by subscribing to our blog and sharing this post on social media. Join the conversation and explore the future of AI with Nvidia!

What is Artificial Intelligence?

Artificial Intelligence (AI) has become a buzzword in recent years, but what does it really mean? This blog post will delve into the basics of AI, how it works, what it can and can’t do, potential pitfalls, and some of the most intriguing aspects of this technology.

Introduction to Artificial Intelligence (AI)

Artificial Intelligence, commonly referred to as AI, is the simulation of human intelligence in machines. These machines are programmed to think and learn like humans, capable of performing tasks that typically require human intelligence such as visual perception, speech recognition, decision-making, and language translation. AI can be found in various applications today, from self-driving cars to voice-activated assistants like Siri and Alexa.

The Inner Workings of AI and Its Comparison to a Hidden Octopus

AI systems work by using algorithms and large datasets to recognize patterns, make decisions, and improve over time. These systems are typically powered by machine learning, a subset of AI that enables machines to learn from experience. Here’s a simplified breakdown of how AI works:

  1. Collect: Data is gathered from various sources, then processed so it is clean and usable.
  2. Model: Algorithms are applied to the data to identify patterns and make predictions.
  3. Train: The AI system is trained on a training dataset, improving its accuracy over time through learning.
  4. Deploy: The trained system is put into use and continues to learn and improve based on feedback.
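
That loop can be seen in miniature with a classic machine-learning library; a minimal sketch using scikit-learn and its bundled iris dataset (any labeled tabular data would do):

```python
# A toy version of the collect -> train -> predict -> improve loop, using scikit-learn.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. Collect and prepare data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# 2-3. Apply an algorithm and train it on the training dataset.
model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)

# 4. Deploy: predict on unseen data and measure how well the predictions hold up.
predictions = model.predict(X_test)
print(f"Accuracy on held-out data: {accuracy_score(y_test, predictions):.2f}")
```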

Think of AI as a secret octopus with many tentacles, each representing a different capability. Just as an octopus uses its tentacles to explore and interact with its environment, AI uses its various functions (like vision, speech, and decision-making) to understand and influence the world around it. The “secret” part comes from the fact that, much like an octopus’s intricate movements can be hard to decipher, the inner workings of AI algorithms can be complex and opaque, often functioning in ways that are not immediately understandable to humans.

What AI Can (and Can’t) Do

AI can analyze vast amounts of data quickly and accurately, recognize patterns, and make predictions based on this data. It can automate repetitive tasks, improving efficiency and reducing errors. Through natural language processing (NLP), AI can understand and generate human language, enabling applications like chatbots and language translators. AI can also identify objects in images and understand spoken language, powering technologies like facial recognition and virtual assistants.

However, AI lacks the ability to understand context in the way humans do and cannot genuinely understand or replicate human emotions. While AI can generate content, it does not possess true creativity or original thought. Additionally, AI cannot make ethical decisions, as it does not understand morality.

How AI Can Go Wrong

AI systems are not infallible and can go wrong in several ways. AI can perpetuate and amplify biases present in training data, leading to unfair or discriminatory outcomes. Incorrect data or flawed algorithms can result in erroneous predictions or decisions. AI systems can also be susceptible to hacking and malicious manipulation. Over-reliance on AI can lead to the erosion of human skills and judgment.

The Importance (and Danger) of Training Data

Training data is crucial for AI systems as it forms the foundation upon which they learn and make decisions. High-quality, diverse training data helps create accurate and reliable AI systems. However, poor-quality or biased training data can lead to inaccurate, unfair, or harmful AI outcomes. Ensuring that training data is representative and free from bias is essential to developing fair and effective AI systems.

How a ‘Language Model’ Makes Images

Language models, like OpenAI’s GPT-3, are primarily designed to process and generate text. However, they can also be used to create images when integrated with other AI models. The language model receives a text prompt describing the desired image. The model interprets the text and generates a detailed description of the image. A connected image-generating AI, such as DALL-E, uses the description to create an image. This process involves complex neural networks and vast datasets to accurately translate textual descriptions into visual representations.
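
In practice, this text-to-image handoff is exposed through straightforward APIs. As one example, here is a short sketch using OpenAI’s Python SDK to request an image from DALL-E 3; it assumes the openai package is installed and an API key is set in the environment:

```python
# Generate an image from a text prompt via OpenAI's Images API.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor painting of an octopus reading a book in a library",
    size="1024x1024",
    n=1,
)

# The API returns a URL (or base64 data) for the generated image.
print(response.data[0].url)
```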

What About AGI Taking Over the World?

Artificial General Intelligence (AGI) refers to a level of AI that can understand, learn, and apply knowledge across a wide range of tasks at a human-like level. While AGI is a fascinating concept, it remains largely theoretical. AGI does not yet exist and is a long way from being realized. The idea of AGI taking over the world is a popular theme in science fiction, but it raises legitimate concerns about control, ethics, and safety. Ensuring that AGI, if developed, is aligned with human values and controlled appropriately is crucial to preventing potential risks.

Conclusion

AI is a powerful technology with the potential to revolutionize various aspects of our lives. Understanding how it works, its capabilities and limitations, and the importance of training data is crucial to harnessing its benefits while mitigating its risks. As AI continues to evolve, it is essential to stay informed and engaged with its development to ensure it serves humanity positively and ethically.

Exploring the Power of Copilot+ PC in Microsoft’s AI Computing Vision

Microsoft has unveiled a bold vision for the future of personal computing, one where artificial intelligence (AI) plays a central role. At the heart of this vision is Microsoft Copilot, a revolutionary platform designed to anticipate user needs and enhance productivity. This article explores the implications of Copilot and the new Copilot+ PC, detailing how they promise to transform both consumer and enterprise computing landscapes.

Microsoft Copilot: The Future of AI in Computing

Anticipating User Needs

Satya Nadella, Microsoft’s chief executive, emphasized the transformative potential of AI during the announcement. He described Copilot as a tool that not only understands user intentions but also proactively assists them. “We’re entering a new era where computers not only understand us but can anticipate our needs,” Nadella remarked, highlighting how Copilot integrates knowledge and expertise across devices and industries.

Empowering Users

Copilot is designed to empower individuals and organizations by providing instant access to information and facilitating creativity and productivity. “Copilot is empowering every person and every organization to be more knowledgeable, productive, and connected,” Nadella stated. This empowerment is achieved through advanced AI capabilities embedded in the new Copilot+ PC.

The Copilot+ PC: A New Category of Devices

The Copilot+ PC is a new category of AI-infused computers produced by leading manufacturers such as Acer, ASUS, Dell, HP, Lenovo, and Samsung. These devices are equipped with cutting-edge AI models, including OpenAI’s GPT-4o, and feature a powerful Neural Processing Unit (NPU) capable of over 40 trillion operations per second (TOPS). Additionally, they run on a re-architected Windows 11 operating system optimized for performance and battery life.

Copilot+ PCs Have Great Potential

Exclusive Features

Copilot+ PCs come with several exclusive features:

  • Recall: Acts as a photographic memory, accessing virtual records of past activities.
  • Live Captions: Provides real-time translations in video chats from multiple languages into English.
  • Image Co-Creation: Capable of generating images from doodles or text prompts.

[Video: Microsoft’s Yusuf Mehdi explains the next generation of Windows AI PCs, outlining three key components that will be part of every device.]

The Impact on Enterprises

Transforming the Enterprise Landscape

Copilot’s introduction raises several questions about its impact on businesses. Analysts like Anshel Sag from Moor Insights & Strategy believe that Copilot sets a new standard for AI in enterprises. Organizations can now optimize their systems for Copilot, enhancing both familiarity and demand for AI-integrated solutions.

Enhancing Productivity

Copilot+ PCs aim to solve the “blank page” problem, helping knowledge workers jumpstart their creativity and productivity. Seth Juarez, Microsoft’s principal program manager for AI, explains that Copilot can accelerate the creative process, making workers more productive by moving quickly from ideation to execution.

Augment, Accelerate, or Automate

Ray Wang from Constellation Research outlines five trends that AI PCs will introduce:

  1. Augmentation: Computers will perform tasks previously unimaginable.
  2. Acceleration: Faster decision-making through rapid information assimilation.
  3. Automation: AI will handle routine tasks, increasing efficiency.
  4. Advisement: AI will provide new types of advice and suggestions.
  5. Autonomous Systems: Although still in development, AI will eventually lead to self-sufficient systems.

Security Considerations

On-Device Protection

Security remains a top priority with the new Copilot+ PCs. These devices are Secured-core PCs, featuring Microsoft’s Pluton security processor to protect sensitive data and ensure secure biometric sign-ins.

Local vs. Cloud AI

Sarah Bird, Microsoft’s chief product officer for responsible AI, emphasizes that AI security on local devices requires similar robust measures as cloud-based AI. The NPU’s capability ensures that on-device AI maintains high performance and security standards.

GPT-4o: OpenAI’s Latest Breakthrough in AI Technology

At the recent Spring Updates event, OpenAI’s Chief Technology Officer, Mira Murati, unveiled the latest breakthrough in AI technology – the GPT-4o multimodal foundation model. This innovative model, along with the introduction of the ChatGPT desktop app, marks a significant milestone for both free and paid users.

The Power of GPT-4o: Voice, Text, and Vision Integration

“It reasons across voice, text, and vision,” Murati exclaimed, highlighting the versatility of this new model. Notably, users will soon be able to capture real-time video through their ChatGPT smartphone apps, expanding the capabilities beyond text-based interactions.

Democratizing AI: Access for All Users

OpenAI aims to demystify AI technology by making it accessible to all users. With the release of GPT-4o, free users will no longer be limited to text-only interactions but will have access to powerful image and document analysis capabilities.

Rollout Plan: From Plus to Enterprise Users

While GPT-4o will eventually be available to all ChatGPT users, the rollout will begin with paying subscribers. Plus and Team users will enjoy increased message limits, with availability for Enterprise users on the horizon.

Real-Time Responses and Emotion Detection

One of the most exciting features of GPT-4o is its ability to respond in real time to audio inputs, detect emotions, and adjust its voice accordingly. This functionality brings a new level of naturalism to AI interactions, similar to what rival AI startup Hume offers.

Pricing and Performance: Enhanced Efficiency

In terms of pricing and performance, GPT-4o offers compelling advantages. With half the price and double the speed of GPT-4 Turbo, along with increased rate limits, developers can expect a more efficient AI solution.
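
To put “half the price” in concrete terms, here is a quick back-of-the-envelope sketch; the GPT-4 Turbo rates ($10 per million input tokens, $30 per million output tokens) are OpenAI’s published prices at the time, and the GPT-4o figures simply halve them per the announcement:

```python
# Back-of-the-envelope API cost comparison for a month of traffic.
# GPT-4 Turbo rates are OpenAI's published prices; GPT-4o is half, per the announcement.
TURBO = {"input": 10.00, "output": 30.00}     # USD per 1M tokens
GPT4O = {k: v / 2 for k, v in TURBO.items()}  # half price: $5 / $15 per 1M tokens

def monthly_cost(rates: dict[str, float], input_m: float, output_m: float) -> float:
    """Cost in USD for input_m million input tokens and output_m million output tokens."""
    return rates["input"] * input_m + rates["output"] * output_m

# Example workload: 20M input tokens, 5M output tokens per month.
print(f"GPT-4 Turbo: ${monthly_cost(TURBO, 20, 5):,.2f}")  # $350.00
print(f"GPT-4o:      ${monthly_cost(GPT4O, 20, 5):,.2f}")  # $175.00
```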

Embracing Innovation: User Adoption and Expectations

With over 100 million ChatGPT users and a thriving ecosystem of custom GPTs in the GPT Store, OpenAI is poised to revolutionize AI interactions. The recent confirmation that the mysterious “gpt2-chatbot” is indeed GPT-4o underscores the anticipation surrounding this new technology.

Conclusion: A New Era of AI Interaction

Despite some minor hiccups during live demos, the future looks promising for GPT-4o and the ChatGPT ecosystem. As users anticipate its widespread availability, the question remains: will GPT-4o redefine AI interactions and set a new standard for naturalistic experiences? Only time will tell.

OpenAI’s Sora Generates Its First Official Music Video

OpenAI sent shockwaves through the tech community and the arts scene earlier this year with the unveiling of its groundbreaking AI model, Sora. This innovative technology promises to revolutionize video creation by producing realistic, high-resolution, seamlessly smooth clips of up to 60 seconds each. However, Sora’s debut has not been without controversy, stirring up concerns among traditional videographers and artists.

The Unveiling of Sora

In February 2024, OpenAI made waves by introducing Sora to a select audience. Although the technology remains unreleased to the public, OpenAI granted access to a small group of “red teamers” for risk assessment and a handpicked selection of visual artists, designers, and filmmakers. Despite this limited release, some early users have already begun experimenting with Sora, producing and sharing innovative projects.

The First Official Music Video with Sora

Among OpenAI’s chosen early access users is writer/director Paul Trillo, who recently made headlines by creating what is being hailed as the “first official music video made with OpenAI’s Sora.” Collaborating with indie chillwave musician Washed Out, Trillo crafted a mesmerizing 4-minute video for the single “The Hardest Part.” The video comprises a series of quick zoom shots seamlessly stitched together, creating the illusion of a continuous zoom effect.

Behind the Scenes

Trillo revealed that the concept for the video had been brewing in his mind for a decade before finally coming to fruition. He disclosed that the video consists of 55 separate clips generated by Sora from a pool of 700, meticulously edited together using Adobe Premiere.

Integration with Premiere Pro

Meanwhile, Adobe has expressed interest in incorporating Sora and other third-party AI video generator models into its Premiere Pro software. However, no timeline has been provided for this integration. Until then, users seeking to replicate Trillo’s workflow may need to generate AI video clips using third-party software like Runway or Pika before importing them into Premiere.

The Artist’s Perspective

In an interview with the Los Angeles Times, Washed Out expressed excitement about incorporating cutting-edge technology like Sora into his creative process. He highlighted the importance of exploring new tools and techniques to push the boundaries of artistic expression.

Power of Sora

Trillo’s use of Sora’s text-to-video capabilities underscores the technology’s potential in the creative landscape. By relying solely on Sora’s abilities, Trillo bypassed the need for traditional image inputs, showcasing the model’s versatility and power.

Embracing AI in Creativity

Trillo’s groundbreaking music video serves as a testament to the growing interest among creatives in harnessing AI tools to tell compelling stories. Despite criticisms of AI technology’s potential exploitation and copyright issues, many artists continue to explore its possibilities for innovation and expression.

Conclusion

As OpenAI continues to push the boundaries of AI technology with Sora, the creative community eagerly anticipates the evolution of storytelling and artistic expression in the digital age. Trillo’s pioneering work with Sora exemplifies the transformative potential of AI in the realm of media creation, paving the way for a new era of innovation and creativity.

Unleash the Power of AI with the Latest Update for Nvidia ChatRTX

Exciting news for AI enthusiasts! Nvidia ChatRTX introduces its latest update, now available for download. This update, showcased at GTC 2024 in March, expands the capabilities of this cutting-edge tech demo and introduces support for additional LLM models for RTX-enabled AI applications.

What’s New in the Update?

  • Expanded LLM Support: ChatRTX now boasts a larger roster of supported LLMs, including Gemma, Google’s latest LLM, and ChatGLM3, an open, bilingual LLM supporting both English and Chinese. This expansion offers users greater flexibility and choice.
  • Photo Support: With the introduction of photo support, users can seamlessly interact with their own photo data without the hassle of complex metadata labeling. Thanks to OpenAI’s Contrastive Language-Image Pre-training (CLIP), searching and interacting with personal photo collections has never been easier (see the sketch after this list).
  • Verbal Speech Recognition: Say hello to Whisper, an AI automatic speech recognition system integrated into ChatRTX. Now, users can converse with their own data, as Whisper enables ChatRTX to understand verbal speech, enhancing the user experience.
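
ChatRTX’s internals aren’t public, but CLIP-based photo search itself is easy to illustrate with the openly released checkpoint; a minimal sketch, assuming the transformers, torch, and Pillow packages and a few local image files:

```python
# Score a set of photos against a natural-language query with CLIP.
# Assumes `transformers`, `torch`, and `Pillow`; the photo filenames are hypothetical.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

photos = ["beach.jpg", "birthday.jpg", "hike.jpg"]
images = [Image.open(p) for p in photos]

inputs = processor(
    text=["photos from a mountain hike"], images=images,
    return_tensors="pt", padding=True,
)
with torch.no_grad():
    logits = model(**inputs).logits_per_text  # similarity of the query to each image

best = logits.softmax(dim=-1).argmax().item()
print(f"Best match: {photos[best]}")
```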

Why Choose ChatRTX?

ChatRTX empowers users to harness the full potential of AI on their RTX-powered PCs. Leveraging the accelerated performance of TensorRT-LLM software and NVIDIA RTX, ChatRTX processes data locally on your PC, ensuring data security. Plus, it’s available on GitHub as a free reference project, allowing developers to explore and expand AI applications using RAG technology for diverse use cases.

Explore Further

For more details, check out Nvidia’s AI Decoded blog, where you’ll find additional information on the latest ChatRTX update. Additionally, don’t miss the new update for the RTX Remix beta, featuring DLSS 3.5 with Ray Reconstruction.

Don’t wait any longer—experience the future of AI with Nvidia ChatRTX today!

Introducing Ideogram Pro: Empowering Creators with Advanced AI Image Generation

As industry giants like Adobe and Meta continually enhance their AI image generation solutions, smaller startups are also stepping up their game to remain competitive. One such innovative player is Ideogram, a Toronto-based company founded by former Google Brain researchers. Today, Ideogram unveils its latest offering: the Pro subscription tier tailored for its most active and professional creators.

Ideogram’s New Pro Subscription Tier

Priced at $48 per month with annual billing (or $60 per month with monthly billing), the Pro tier complements Ideogram’s existing Free, Basic ($7 per month, billed annually), and Plus ($16 per month, billed annually) tiers. This addition underscores Ideogram’s commitment to catering to creators of all levels.

Enhanced Features for Professional Creators

The Pro plan introduces several enhancements over Ideogram’s other paid options. Notably, subscribers gain the ability to submit up to 3,000 text prompts per month to Ideogram’s AI image generation platform. These prompts are prioritized for rapid image generation, with results delivered in less than 15 seconds, and since each prompt yields four images, Pro subscribers can generate up to 12,000 images per month.

Streamlined Workflow with Bulk Prompt Uploads

Ideogram is also revolutionizing the image generation process by allowing users to upload prompts in bulk via CSV format. This feature eliminates the need for manual input, saving time and streamlining the workflow for Pro subscribers.
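
Ideogram’s exact CSV template isn’t spelled out here, so treat the single “prompt” column below as an assumption; the sketch simply shows how a batch of prompts can be assembled with Python’s standard csv module before upload:

```python
# Build a CSV of prompts for bulk upload using only the standard library.
# The single "prompt" column is an assumption about the expected format;
# check Ideogram's upload template for the exact column names it requires.
import csv

prompts = [
    "A neon sign reading 'OPEN LATE' over a rainy street, photorealistic",
    "Minimalist poster with the word 'BLOOM' in serif type, pastel palette",
    "Vintage travel sticker that says 'TORONTO' with a skyline silhouette",
]

with open("prompts.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["prompt"])  # assumed header row
    writer.writerows([p] for p in prompts)
```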

Unique Capabilities and Commercial Usage

While Ideogram does not offer indemnification for enterprises like some of its competitors, it does permit commercial usage under its terms of service. Moreover, Ideogram stands out for its ability to generate stylized text and typography within images, making it a valuable tool for various commercial projects.

Join the Ideogram Pro Community Today

Creators eager to harness the power of Ideogram’s advanced AI image generation can sign up for the Pro plan starting today. Don’t miss out on the opportunity to elevate your creations with cutting-edge technology.

Conclusion

With Ideogram’s Pro subscription tier, creators gain access to a host of advanced features designed to enhance their image generation experience. From lightning-fast processing to streamlined workflows, Ideogram is empowering creators to unleash their creativity like never before. Join the Pro community today and discover the possibilities of AI-powered imagery.