Google introduced three key AI products in one day and simultaneously cut the price of the top-tier Google AI Ultra plan from $250 to $100 per month. The company is also dropping daily query limits and moving to a compute-based pricing model.
Google I/O 2026 was the company's most AI-rich event in recent years: the release of the Gemini 3.5 Flash and Gemini Omni models, the launch of the Gemini Spark personal agent, and a radical restructuring of the pricing plan. For businesses, this means cheaper access to top models and a new wave of Gemini-based services in the Google Workspace, YouTube, and Android ecosystems. Gemini 3.5 Flash already powers Google's AI search mode and new AI Overviews, while Omni combines text, image, and video in a single pipeline. For Kazakhstani companies and integrators, including companies like Alashed IT (it.alashed.kz), this is an opportunity to create products on a globally competitive level.
Gemini 3.5 Flash: a new basic AI model for search and business
Gemini 3.5 Flash is the first model in the 3.5 line and is already available globally, including in Europe, the USA, and several Asian countries. According to Google, it now underpins AI Mode in Google Search and the AI Overviews feature, which inserts generated answers into search results. Flash is positioned as a faster and more economical alternative to heavier models, aimed at mass user and business scenarios. This matters because search and office tasks require minimal latency and predictable costs.
For developers, 3.5 Flash is available via API and is already used in Antigravity 2.0, an updated environment for generating interactive content. At Google I/O, it was announced that the model is optimized for high load: it handles both short text queries and complex chains of prompts, and it serves as the 'engine' for a host of new features, from Gmail Live to Docs Live. In effect, Google is shifting a significant portion of consumer AI traffic to a single optimized model, which simplifies cost planning for large clients.
For corporate users, this means more stable performance and predictability. Products like Gemini 3.5 Flash allow companies to build their own assistants for call centers, internal document search, and employee support. Integrators like Alashed IT (it.alashed.kz) can set up hybrid solutions: using Flash for mass tasks (FAQs, classification, summarization) and connecting more powerful models only for niche scenarios requiring complex reasoning or multimodality. In the context of rising AI infrastructure costs, such architectures are becoming the standard.
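The hybrid setup described above can be sketched as a simple routing rule. This is an illustration only: the model identifiers, the `Task` type, and the task names are assumptions made for the example, and the heavier model is a placeholder because the article does not name one.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; the names used by the real API may differ.
FAST_MODEL = "gemini-3.5-flash"
HEAVY_MODEL = "heavier-reasoning-model"  # placeholder, not named in the article

# Task types the article lists as mass workloads.
MASS_TASKS = {"faq", "classification", "summarization"}


@dataclass(frozen=True)
class Task:
    kind: str                 # e.g. "faq" or "contract_analysis"
    multimodal: bool = False  # True if the task involves images, audio, or video


def pick_model(task: Task) -> str:
    """Send mass text tasks to the fast model and everything else to the heavy one."""
    if task.kind in MASS_TASKS and not task.multimodal:
        return FAST_MODEL
    return HEAVY_MODEL
```

In a real deployment the rule would likely be driven by measured quality and cost rather than a fixed set of task names, but the split itself stays the same: a cheap default with an explicit escalation path.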
Importantly, Flash is already integrated into the updated Gemini interface and Antigravity 2.0, which is focused on creating interactive content. This allows for covering multiple areas—search, office tasks, and creativity—with a single model without the need to switch stacks. For businesses, this reduces complexity and time to market for new features: it is enough to set up integration with Gemini 3.5 Flash once, and then expand functionality through updates from Google without major modifications to internal systems.
Gemini Omni: a unified multimodal model for text, images, and video
Gemini Omni is a new model family that combines Gemini's knowledge and logical reasoning with the generative capabilities of Nano Banana (Google's image model) and Veo (video model). The key feature of Omni is that it accepts any format—text, images, audio, video—and can output video based on factual data. This is not just a generative image or video, but content linked to the real world, based on knowledge and search.
The model is already available to Google AI Plus, Pro, and Ultra subscribers in the Gemini app, Google Flow studio, and YouTube Shorts for creating short videos. Such coverage across multiple products makes Omni the center of Google's creative ecosystem. Marketers can upload text scripts, briefs, and references and get videos adapted for the YouTube Shorts format. Content teams can turn existing text materials and presentations into visual videos for social media and internal training.
From a technical perspective, combining Nano Banana and Veo within a single model means that there is no need to manually 'glue' multiple services together. Omni manages the pipeline itself: analyzes the request, accesses the knowledge base, creates a storyboard, and generates the final video. For developers, this reduces integration complexity: one API endpoint instead of several. Companies like Alashed IT (it.alashed.kz) can build solutions for e-commerce, education, or media where fast video content production in large volumes is important.
For businesses in Central Asia, this opens up the opportunity to scale multimedia content production without a large studio team. Local brands will be able to produce dozens of videos a day for product promotion, staff training, or explaining complex services (e.g., fintech or telecom). Since Omni is integrated into YouTube Shorts, companies get a direct content distribution channel, bypassing complex rendering and post-production chains. In the coming months, we can expect the appearance of ready-made templates and presets adapted for various industries and formats.
Gemini Spark and personal AI agents 24/7 in the cloud
A separate major announcement at Google I/O 2026 is Gemini Spark, a personal AI agent running 24/7 on Google Cloud virtual machines. Importantly, Spark continues to perform tasks even when the user's laptop or smartphone is turned off. This is a fundamental difference from classic chatbots: Spark functions as an autonomous employee who interacts with Google Workspace, third-party applications, and the web without constant user involvement.
The service will launch next week for Google AI Ultra subscribers in the US, and integration with Chrome is planned for this summer. This means that in the coming months, it will be possible to entrust Spark with routine tasks: monitoring email, preparing reports in Google Sheets, updating tasks in Trello or Jira, searching for potential clients on LinkedIn, and processing incoming requests from the website. In practice, this is the equivalent of a dedicated mid-level assistant, but with a subscription fee instead of a salary.
Technically, Spark uses the Gemini 3.5 Flash model and the Antigravity framework. The latter keeps agents running stably in the background and coordinates their actions with each other. Companies can create multiple Spark agents for different tasks: one tracks CRM and reports, another tracks ad campaign analytics, and a third tracks brand mentions. Combined with Google Cloud, this becomes a full-fledged layer of business process automation.
For Kazakhstani and Central Asian companies, this is a chance to reconsider their approach to outsourcing and internal operations. Integrators like Alashed IT (it.alashed.kz) will be able to offer services for designing and configuring Spark agents for specific processes: from preparing tender documentation to inventory control. In the context of a shortage of qualified IT and analytics personnel, autonomous agents can cover up to 20–40 percent of the routine workload of office employees. At the same time, the business pays a fixed subscription, and scaling is achieved by connecting additional agents without complex hiring.
Reduction in AI Ultra price and new compute-based pricing model
Alongside the product announcements, an important strategic change is the sharp reduction in the price of the top-tier Google AI Ultra plan from $250 to $100 per month. The new plan includes usage limits for the Gemini app five times higher than those of the current $20 AI Pro plan, 20 terabytes of cloud storage, a YouTube Premium subscription, and early access to Gemini Spark. The previous $250 tier remains available at a reduced price of $200 with the same set of features, occupying the upper segment for the most demanding users.
At the same time, Google is completely abandoning daily query limits across all Gemini plans. Instead, a compute-based billing model is introduced. A simple text query will now consume significantly less quota than generating a video or a complex multimodal response. Limits are updated every five hours until the weekly quota is reached. This approach is closer to the cloud pricing models adopted in infrastructure services and allows for more accurate cost planning.
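A minimal sketch of how such a quota could behave, assuming a per-window budget that refreshes every five hours and a separate weekly cap. The unit costs, the budgets, and the `ComputeQuota` class are hypothetical; the article gives no actual figures, and the weekly reset is left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical compute costs per request type, in abstract units.
# The article does not give real numbers.
COST_UNITS = {"text": 1, "image": 20, "video": 400}

WINDOW_HOURS = 5  # the article says limits refresh every five hours


@dataclass
class ComputeQuota:
    window_budget: int         # units available per five-hour window (assumed)
    weekly_budget: int         # units available per week (assumed)
    window_used: int = 0
    weekly_used: int = 0
    window_start: float = 0.0  # hours since the start of the week

    def try_spend(self, kind: str, now_hours: float) -> bool:
        """Charge one request; return False if either limit would be exceeded."""
        # Open a fresh window once five hours have passed.
        if now_hours - self.window_start >= WINDOW_HOURS:
            self.window_start = now_hours
            self.window_used = 0
        cost = COST_UNITS[kind]
        if self.window_used + cost > self.window_budget:
            return False
        if self.weekly_used + cost > self.weekly_budget:
            return False
        self.window_used += cost
        self.weekly_used += cost
        return True
```

With a window budget of 500 units, one video request leaves room for plenty of text queries but not for a second video until the next window opens, which is the asymmetry the new model is built around.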
For businesses, this means the ability to distribute the load flexibly: cheap scenarios (chat support, email summarization, standard reports) can be processed thousands of times without significantly affecting the budget, while more expensive operations, such as generating videos through Gemini Omni, can be budgeted case by case. Integrators who build products on top of Gemini and sell them to clients on a subscription basis gain predictability: they can now include specific compute quotas in the cost of the service instead of abstract 'query limits'.
Companies like Alashed IT (it.alashed.kz) get the opportunity to design pricing plans for their clients, linking the cost to the types of tasks rather than the raw number of queries. For example, 50,000 text operations and 500 multimodal tasks per month. This makes the model transparent to CFOs and CIOs responsible for digital transformation budgets. Combined with the drop in the AI Ultra price to $100, the barrier to entry for Google's top-tier AI is significantly reduced, opening access to advanced features not only for corporations but also for medium-sized businesses.
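The example bundle above can be turned into a rough pricing calculation. Every number in this sketch (units per operation, price per 1,000 units, the integrator margin) is an assumption chosen for illustration and does not come from Google or from the article.

```python
# Hypothetical unit costs and price; none of these figures come from Google.
UNITS_PER_TEXT_OP = 1
UNITS_PER_MULTIMODAL_TASK = 50
PRICE_PER_1000_UNITS_USD = 0.40


def plan_units(text_ops: int, multimodal_tasks: int) -> int:
    """Total compute units consumed by a monthly bundle."""
    return (text_ops * UNITS_PER_TEXT_OP
            + multimodal_tasks * UNITS_PER_MULTIMODAL_TASK)


def plan_price_usd(text_ops: int, multimodal_tasks: int,
                   margin: float = 0.30) -> float:
    """Client-facing monthly price: compute cost plus the integrator's margin."""
    cost = plan_units(text_ops, multimodal_tasks) / 1000 * PRICE_PER_1000_UNITS_USD
    return round(cost * (1 + margin), 2)
```

Under these assumed rates, `plan_units(50_000, 500)` comes to 75,000 units, a third of which is consumed by just 500 multimodal tasks. That ratio is what a client's finance team would want to see in the plan description.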
Integration of Gemini into the ecosystem: Ask YouTube, Universal Cart, and Siri
In addition to models and pricing, Google showed how Gemini is embedded in key consumer and business products. Ask YouTube, launched as part of I/O 2026, is already available today for YouTube Premium subscribers in the US via youtube.com/new. This is a Gemini-based search dialogue layer that handles complex multi-step queries and returns not just videos, but specific relevant fragments of videos. For educational and B2B content, this is a radical simplification of access to the right information: an employee can ask 'how to configure Kubernetes autoscaling' and immediately get a specific timestamp from the correct video.
Universal Cart is another significant announcement. This is a unified AI cart that works across search, the Gemini app, YouTube, and Gmail. The user can add products from different sources and place an order either through Google's infrastructure or directly with retailers. To do this, Google has partnered with Amazon, Shopify, and Walmart and introduced a new open standard, the Universal Commerce Protocol (UCP), for cross-merchant AI commerce. This is a step towards AI not just recommending products, but managing the entire purchase funnel.
At the platform level, an important signal was the statement by Google Cloud CEO Thomas Kurian at the Google Cloud Next 2026 conference in Las Vegas. He confirmed that Gemini will be the basis for a new, more personalized version of Siri, which will be released later this year. As part of the collaboration, Google is the preferred cloud provider for developing the next generation of Apple Foundation Models based on Gemini technologies. A roadmap has been outlined: in spring 2026, Gemini will assist Siri's contextual awareness in iOS 26.4; in fall 2026, with the iPhone 18 release, a full-fledged conversational Siri with multi-turn dialogue and complex task execution will follow.
For integrators and companies in Kazakhstan and Central Asia, this close integration of Gemini into global platforms means that end users will increasingly interact with AI through familiar applications—YouTube, Gmail, mobile OS. This shifts the focus from developing 'from scratch' to building add-ons and specialized services on top of existing interfaces. Companies like Alashed IT (it.alashed.kz) can focus on industry expertise and automation of clients' internal processes, using Ask YouTube and Universal Cart as a ready-made infrastructure for training staff and e-commerce.
What this means for Kazakhstan
For Kazakhstan and Central Asia, the changes Google announced create several practical opportunities. Firstly, the reduction in the AI Ultra price to $100 per month makes top-tier Gemini models accessible not only to large banks and telecom operators, but also to medium-sized businesses, IT outsourcers, and educational and media companies. At list price, a single Ultra subscription comes to $1,200 per year, and even with regional markups it fits within a budget of around $1,500–$2,000, which is comparable to the salary of one junior specialist.
Secondly, the launch of Gemini 3.5 Flash and Omni creates a basis for localized solutions in Kazakh and Russian, especially in customer support and multimedia content. Integrators, including companies like Alashed IT (it.alashed.kz), can already design services for local banks, telecom operators, retail, and government companies: voice and text bots, internal document search, video instructions, and training materials. The availability of ready-made products like Ask YouTube and Universal Cart simplifies the launch of such solutions on the market by using familiar platforms.
Thirdly, businesses in Central Asia face a shortage of qualified personnel in data analytics and DevOps. Personal Gemini Spark agents, running 24/7 in the cloud, can potentially cover a significant portion of routine tasks: monitoring reports, updating CRM, preparing presentations, and documentation. Even if Spark is initially available only in select regions, experience shows that such services expand to new markets within 6–12 months. Companies in Kazakhstan should prepare their infrastructure and processes for working with AI agents now, in order to use them as soon as the service becomes officially available.
Finally, the growing integration of Gemini into global ecosystems such as Apple's and Google's makes the choice of local partners and system integrators critically important. Outsourcing companies like Alashed IT (it.alashed.kz), which understand the specifics of local regulation, languages, and business culture, will become a key link between global AI platforms and the real needs of Kazakhstani clients.
Google reduced the price of the top-tier AI Ultra plan from $250 to $100 per month, while abandoning daily query limits in favor of a compute-based pricing model.
The lineup of announcements at Google I/O 2026 shows that the AI market is rapidly shifting from individual models to comprehensive ecosystems and autonomous agents. Gemini 3.5 Flash, Omni, and Spark combine search, office tasks, and multimedia content into a single infrastructure. For businesses in Kazakhstan and Central Asia, this is not only a reduction in access costs to advanced models, but also a chance to quickly build new products based on global platforms. Companies that start systematically implementing AI and working with partners like Alashed IT (it.alashed.kz) in 2026 will gain a significant competitive advantage in the coming years.
Frequently asked questions
What is Gemini 3.5 Flash and how is it useful for business?
Gemini 3.5 Flash is a new Google model optimized for speed and cost, already working in AI search mode and AI Overviews. It is suitable for mass tasks: chat support, summarization, internal document search. For businesses, this means fast responses with sub-second latency and predictable costs due to compute-based pricing. Integrators like Alashed IT (it.alashed.kz) can build corporate assistants and support bots based on it.
How does Gemini Omni differ from other AI models for video?
Gemini Omni combines text, image, audio, and video in a single model and uses Gemini's knowledge along with the generative capabilities of Nano Banana and Veo. Unlike separate video models, Omni can accept any type of input data and output video based on facts and search data. It is built into Gemini, Google Flow, and YouTube Shorts, which simplifies content generation and publishing. For businesses, this is the ability to create dozens of videos a day without a separate studio and complex pipeline.
What are the risks associated with using Gemini Spark as an AI agent?
The main risks of Gemini Spark are related to the security of access to corporate data, the quality of automated actions, and the dependence on Google's cloud infrastructure. The agent works 24/7 and can interact with Gmail, documents, and third-party services, so clear configuration of rights and logging of actions is required. Errors in tasks or access policies can lead to incorrect email sends or data changes. Therefore, it is better to implement Spark through experienced integrators, such as Alashed IT (it.alashed.kz), with a phased expansion of the agent's permissions.
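The idea of phased permissions with action logging can be sketched as follows. This is not a Spark API: the phase table, the action names, and the `AgentPolicy` class are hypothetical and only illustrate the control pattern.

```python
from dataclasses import dataclass, field

# Hypothetical rollout phases; action names are illustrative, not a Spark API.
PHASES = {
    1: {"read_email", "read_sheet"},
    2: {"read_email", "read_sheet", "draft_email", "update_sheet"},
    3: {"read_email", "read_sheet", "draft_email", "update_sheet", "send_email"},
}


@dataclass
class AgentPolicy:
    phase: int = 1                                  # current rollout phase
    audit_log: list = field(default_factory=list)   # (phase, action, outcome) tuples

    def authorize(self, action: str) -> bool:
        """Check an action against the current phase and record the decision."""
        allowed = action in PHASES[self.phase]
        outcome = "allowed" if allowed else "denied"
        self.audit_log.append((self.phase, action, outcome))
        return allowed
```

Starting an agent in phase 1 means it can read but not send. Denied attempts still land in the log, which gives the team evidence for deciding when to widen permissions.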
How long does it take to implement Gemini-based solutions for a company?
A typical pilot project based on Gemini 3.5 Flash or Omni takes 4 to 8 weeks: 1–2 weeks for process analysis and design, 2–3 weeks for integration with existing systems, and 1–3 weeks for testing. For simple scenarios, such as an FAQ chatbot or document summarization, an MVP can be launched in 2–3 weeks. More complex multimodal solutions with video and deep integration into CRM/ERP may require 3–4 months. Companies like Alashed IT (it.alashed.kz) typically offer a phased approach with a quick pilot and subsequent scaling.
How can a business save on using Gemini's new AI services from Google?
Savings are achieved by correctly distributing the load between models and types of tasks. Inexpensive operations (plain text, simple analytics) are best handled by Gemini 3.5 Flash, with compute quotas used as efficiently as possible. Expensive video generation through Omni should be restricted to clearly defined scenarios with explicit limits. Switching to the AI Ultra plan at $100 per month is justified if the team uses AI daily; otherwise, AI Pro at $20 is sufficient. Integrators like Alashed IT (it.alashed.kz) help configure the architecture to reduce the total cost of ownership by 20–40 percent through query optimization and caching results.
Photo: Dmitrii E. / Unsplash