Microsoft Unveils New AI Features for Azure at Inspire Conference

At its annual Inspire conference, Microsoft announced several new AI features for Azure, the most notable being Vector Search. Available in preview through Azure Cognitive Search, Vector Search uses machine learning to capture the meaning and context of unstructured data, such as images and text, to improve search efficiency.

Vectorization, an increasingly popular technique in search, converts words or images into vectors: numerical representations that encode their meaning. Because vectors can be processed mathematically, machines can use them to structure and make sense of data. For example, words with related meanings, such as “king” and “queen,” sit close together in “vector space,” so a system can recognize that they are connected and retrieve them quickly from vast databases of words.
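As a rough illustration of what "close together in vector space" means, the sketch below compares a few words using cosine similarity, a common way to measure how closely two vectors point in the same direction. The three-dimensional vectors are invented for this example; real embedding models produce vectors with hundreds or thousands of dimensions, and nothing here reflects how Azure computes them.

```python
import math

# Toy 3-dimensional "embeddings", made up purely for illustration.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "banana": [0.1, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, vectors=TOY_VECTORS):
    """Return the other word whose vector is most similar to `word`."""
    query = vectors[word]
    candidates = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return max(candidates, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(nearest("king"))
```

With these toy values, "king" scores far higher against "queen" than against "banana", which is the kind of relationship a vector search index exploits at much larger scale.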

Companies like Qdrant and SeMI Technologies utilize vector search to power their database services, as do tech giants like Amazon and Google.

Microsoft’s vector search offers three main capabilities: “pure” vector search, hybrid retrieval, and “sophisticated” reranking. The company highlights that it can be applied in various apps and services to generate personalized responses in natural language, deliver product recommendations, and identify data patterns.
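To make "hybrid retrieval" more concrete, here is a minimal sketch of one common way to merge a keyword ranking with a vector ranking: reciprocal rank fusion. The article does not say which fusion method Microsoft uses, so treat the technique, the function name, the constant `k=60`, and the document IDs as assumptions chosen for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by more than one retriever rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical results from a keyword search and a vector search.
    keyword_hits = ["doc_a", "doc_b", "doc_c"]
    vector_hits = ["doc_b", "doc_d", "doc_a"]
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

In this example the documents found by both retrievers ("doc_b" and "doc_a") end up ahead of those found by only one. A reranking stage, which the article also mentions, would then reorder the top of this fused list with a more expensive relevance model.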

In a blog post, Microsoft explains, “Vector search is integrated with Azure AI, allowing customers to build search-enabled, chat-based apps, convert images into vector representations using Azure AI Vision, and retrieve relevant information from large data sets to help automate processes and workflows. The integration of Vector search seamlessly extends to other capabilities of Azure Cognitive Search, including faceted navigation, filters, and more.”

In addition to Vector Search, Microsoft introduced the Document Generative AI solution as part of Azure’s offerings. This solution integrates Microsoft’s existing AI-powered document processing services, including Azure Form Recognizer, with the Azure OpenAI Service. Leveraging OpenAI’s latest AI language models, the Document Generative AI solution enables tasks such as report summarization, value extraction, knowledge mining, and the generation of new types of documents. Essentially, it allows companies to build applications similar to OpenAI’s ChatGPT that can read documents and base their responses on the content.

For instance, using the Document Generative AI, customers can upload invoices, bills, and contracts, enabling employees to ask questions about service guarantees and specific line items. The solution provides answers in text, as well as images and tables, and offers citations with links to the source content.

Microsoft elaborates, “[Using the Document Generative AI solution, you can] interact with documents using natural language and generate new content from your existing documents, including blog posts, newsletters, summaries, and captions… Whether you require intelligent document chat capabilities, writing assistance, query support, comprehensive search functionality, or even document translation, Document Generative AI can handle complex and diverse document tasks through models from OpenAI.”

Furthermore, Microsoft announced that OpenAI’s Whisper model, an automatic speech recognition model, will soon be available on the Azure OpenAI Service and Microsoft’s suite of AI speech services. Enterprise customers will be able to utilize Whisper to transcribe and translate audio content and produce batch transcriptions at scale.

Rounding out the AI announcements at Inspire, Microsoft introduced the public preview of Real-time Diarization, an AI-driven speech service that can identify speakers in real time when multiple people are speaking. The company also announced the general availability of Custom Neural Voice, which uses AI to closely replicate an actor’s voice or create an original synthetic voice.

Previously, Custom Neural Voice had limited access, requiring customers to apply and be approved by Microsoft in order to use it.

To address concerns about deepfake misuse, Microsoft has built controls into Custom Neural Voice. When a customer submits a recording, the voice actor, if one is involved, must provide a statement acknowledging that they understand the technology and are aware that the customer intends to create a voice. The recording is then checked with speaker verification to confirm that the voices match before the customer can begin generating a voice.

Microsoft also requires customers to obtain consent from voice talent through contractual agreements and adhere to a code of conduct before using Custom Neural Voice. Additionally, Microsoft offers watermarking and detection tools to facilitate the identification of audio clips created with Custom Neural Voice.

Even if these controls prove effective, they may not fully resolve the licensing and consent controversies surrounding voice cloning technology. Microsoft has evidently decided not to engage in that particular battle.
