DeepSeek’s R1 and OpenAI’s Deep Research just redefined AI — RAG, distillation, and custom models will never be the same




Things are moving quickly in AI—and if you’re not keeping up, you’re falling behind. 

Two recent developments are reshaping the landscape for developers and enterprises alike: DeepSeek’s R1 model release and OpenAI’s new Deep Research product. Together, they’re redefining the cost and accessibility of powerful reasoning models, a shift that has been widely reported. Less discussed, however, is how they’ll push companies to use techniques like distillation, supervised fine-tuning (SFT), reinforcement learning (RL), and retrieval-augmented generation (RAG) to build smarter, more specialized AI applications.

As the initial excitement around DeepSeek’s achievements begins to settle, developers and enterprise decision-makers need to consider what it means for them. From pricing and performance to hallucination risks and the importance of clean data, here’s what these breakthroughs mean for anyone building AI today.

Cheaper, transparent, industry-leading reasoning models – but through distillation

The headline with DeepSeek-R1 is simple: It delivers an industry-leading reasoning model at a fraction of the cost of OpenAI’s o1. Specifically, it’s about 30 times cheaper to run, and unlike many closed models, DeepSeek offers full transparency around its reasoning steps. For developers, this means you can now build highly customized AI models without breaking the bank—whether through distillation, fine-tuning, or simple RAG implementations.

Distillation, in particular, is emerging as a powerful tool. By using DeepSeek-R1 as a “teacher model,” companies can create smaller, task-specific models that inherit R1’s superior reasoning capabilities. These smaller models are, in fact, the future for most enterprise companies. The full R1 reasoning model can be more than companies need: it thinks too long and doesn’t take the decisive action required for specific domain applications. “One of the things that no one is really talking about, certainly in the mainstream media, is that the reasoning models are not working that well for things like agents,” said Sam Witteveen, an ML developer who works on AI agents, which are increasingly orchestrating enterprise applications.

As part of its release, DeepSeek distilled its own reasoning capabilities onto a number of smaller models, including open-source models from Meta’s Llama family and Alibaba’s Qwen family, as described in its paper. It’s these smaller models that can then be optimized for specific tasks. This trend toward smaller, faster models to serve custom-built needs will accelerate: there will be armies of them. “We are starting to move into a world now where people are using multiple models. They’re not just using one model all the time,” said Witteveen. And this includes the low-cost, smaller closed-source models from Google and OpenAI as well. “Meaning that models like Gemini Flash, GPT-4o Mini, and these really cheap models actually work really well for 80% of use cases,” he said.
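To make the first half of that workflow concrete, here is a minimal sketch of collecting teacher outputs from a large reasoning model to build a distillation dataset for a smaller student. It is a sketch only: it assumes access to DeepSeek’s OpenAI-compatible API with the deepseek-reasoner model name and a DEEPSEEK_API_KEY environment variable, and the prompts are invented placeholders.

```python
# Sketch: collect teacher outputs from a large reasoning model to build a
# distillation dataset for a smaller student model. Assumes an OpenAI-compatible
# endpoint, the "deepseek-reasoner" model name and a DEEPSEEK_API_KEY variable;
# the prompts are placeholders.
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

prompts = [
    "A shipment of 240 containers is split across three vessels in a 3:4:5 ratio. How many go on each?",
    "Explain step by step why a cold-chain shipment needs a reefer container rather than a dry box.",
]

with open("distill_dataset.jsonl", "w") as f:
    for prompt in prompts:
        response = client.chat.completions.create(
            model="deepseek-reasoner",  # the teacher; swap in whichever reasoning model you have access to
            messages=[{"role": "user", "content": prompt}],
        )
        answer = response.choices[0].message.content
        # Each line becomes one supervised training example for the student model.
        f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
```

The resulting file of prompt-completion pairs is what a smaller Llama or Qwen checkpoint can then be fine-tuned on, which is essentially the recipe DeepSeek describes for its distilled models.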

If you work in an obscure domain, and have resources: Use SFT… 

After the distillation step, enterprise companies have a few options to make sure the model is ready for their specific application. If you’re a company in a very specific domain, where the details are not on the web or in books that LLMs could have trained on, you can inject the model with your own domain-specific data sets through a process called supervised fine-tuning (SFT). One example would be the shipping container-building industry, where specifications, protocols and regulations are not widely available.

DeepSeek showed that you can do this well with “thousands” of question-and-answer examples. For an example of how others can put this into practice, Chris Hay, an IBM engineer, demonstrated how he fine-tuned a small model using his own math-specific datasets to achieve lightning-fast responses, outperforming OpenAI’s o1 on the same tasks (see his hands-on video here).
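As a rough illustration of what that SFT step looks like, here is a minimal sketch using Hugging Face’s trl library, with DeepSeek’s distilled 1.5B Qwen checkpoint as the base model. The container-industry Q&A pairs are invented for illustration; a real run would use the thousands of examples described above.

```python
# Sketch: supervised fine-tuning (SFT) of a small distilled model on
# domain-specific Q&A pairs with Hugging Face's trl library. The two
# examples below are invented placeholders; real datasets hold thousands.
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

pairs = [
    {"q": "What steel is typically used for container wall panels?",
     "a": "Corrugated weathering (corten) steel panels welded to the corner-post frame."},
    {"q": "What does a CSC safety approval plate certify?",
     "a": "That the container meets the structural requirements of the International Convention for Safe Containers."},
]

# SFTTrainer accepts a dataset with a single "text" column of complete examples.
dataset = Dataset.from_list(
    [{"text": f"Question: {p['q']}\nAnswer: {p['a']}"} for p in pairs]
)

trainer = SFTTrainer(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",  # small distilled student
    train_dataset=dataset,
    args=SFTConfig(output_dir="domain-sft", num_train_epochs=3),
)
trainer.train()
```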

…and a little RL

Additionally, companies wanting to train a model with additional alignment to specific preferences – for example, making a customer support chatbot sound empathetic while being concise – will want to do some reinforcement learning (RL) on the model. This is also good if a company wants its chatbot to adapt its tone and recommendations based on user feedback. As every model gets good at everything, “personality” is going to be increasingly big, said Wharton AI professor Ethan Mollick on X yesterday.
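In practice, many teams approximate this preference-alignment step with direct preference optimization (DPO), which trains directly on pairs of preferred and rejected responses rather than running a full RL loop. The sketch below uses trl’s DPOTrainer; the support-chatbot preference pair and the base model choice are assumptions for illustration.

```python
# Sketch: aligning a chatbot's tone with preference pairs via DPO, a common,
# simpler stand-in for a full RLHF pipeline. The preference data is invented.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Each record pairs a "chosen" reply (empathetic, concise) with a "rejected" one.
dataset = Dataset.from_list([
    {
        "prompt": "My order arrived damaged.",
        "chosen": "I'm sorry that happened. I've started a replacement for you; it ships today.",
        "rejected": "Damage claims must be filed via form 27-B within 10 business days.",
    },
])

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="support-tone-dpo", num_train_epochs=1),
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```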

These SFT and RL steps can be tricky for companies to implement well, however. Feed the model data from only one narrow domain, or tune it to behave a certain way, and it can become far less useful for tasks outside that domain or style (a problem known as catastrophic forgetting).

For most companies, RAG will be good enough

For most companies, however, retrieval-augmented generation (RAG) is the easiest and safest path forward. RAG is a relatively straightforward process that allows organizations to ground their models with proprietary data contained in their own databases, ensuring outputs are accurate and domain-specific. Here, a user’s prompt is used to query vector and graph databases for information relevant to that prompt, and the retrieved content is passed to the LLM along with the original question. RAG pipelines have gotten very good at finding only the most relevant content.

This approach also helps counteract some of the hallucination issues associated with DeepSeek, which currently hallucinates 14% of the time compared to 8% for OpenAI’s o3 model, according to a study done by Vectara, a vendor that helps companies with the RAG process.

This combination of distilled models plus RAG is where the magic will come from for most companies. It has become incredibly easy to do, even for those with limited data science or coding expertise. I personally downloaded DeepSeek’s distilled 1.5B Qwen model, the smallest one, so that it could fit comfortably on my MacBook Air. I then loaded some PDFs of job applicant resumes into a vector database and asked the model to look over the applicants and tell me which ones were qualified to work at VentureBeat. (In all, this took me 74 lines of code, which I basically borrowed from others doing the same.)
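For readers who want to try something similar, here is a sketch of that kind of pipeline rather than the author’s exact 74 lines: pypdf for text extraction, Chroma as the vector store, and the distilled model served locally through Ollama. The resumes directory, the question and the Ollama model tag are assumptions.

```python
# Sketch of a minimal local RAG pipeline like the one described above (not the
# author's exact code): extract resume text with pypdf, index it in a Chroma
# vector store, retrieve the most relevant resumes, and ask a locally served
# distilled model to assess the candidates.
from pathlib import Path

import chromadb
import ollama
from pypdf import PdfReader

# 1. Extract text from each resume PDF (the directory name is a placeholder).
resumes = {}
for pdf_path in Path("resumes").glob("*.pdf"):
    reader = PdfReader(pdf_path)
    resumes[pdf_path.stem] = "\n".join(page.extract_text() or "" for page in reader.pages)

# 2. Index the resumes in an in-memory vector database.
collection = chromadb.Client().create_collection("resumes")
collection.add(documents=list(resumes.values()), ids=list(resumes.keys()))

# 3. Retrieve the resumes most relevant to the question.
question = "Which applicants are qualified to work as an AI reporter?"
hits = collection.query(query_texts=[question], n_results=3)
context = "\n\n---\n\n".join(hits["documents"][0])

# 4. Ask the local distilled model, grounded in the retrieved resumes.
response = ollama.chat(
    model="deepseek-r1:1.5b",  # the distilled 1.5B Qwen model served by Ollama
    messages=[{
        "role": "user",
        "content": f"Using only these resumes:\n{context}\n\nAnswer: {question}",
    }],
)
print(response["message"]["content"])  # includes the model's <think> reasoning trace
```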

I loved that the DeepSeek distilled model showed its thinking process behind why it did or did not recommend each applicant, a level of transparency that I wouldn’t have gotten easily before DeepSeek’s release.

In my recent video discussion on DeepSeek and RAG, I walked through how simple it has become to implement RAG in practical applications, even for non-experts. Sam Witteveen also contributed to the discussion by breaking down how RAG pipelines work and why enterprises are increasingly relying on them instead of fully fine-tuning models. (Watch it here).

OpenAI Deep Research: Extending RAG’s capabilities — but with caveats

While DeepSeek is making reasoning models cheaper and more transparent, OpenAI’s Deep Research, announced Sunday, represents a different but complementary shift. It takes RAG to a new level by crawling the web to produce highly customized research. That output can then be inserted into the document stores companies use for RAG, alongside their own data.

This functionality, often referred to as agentic RAG, allows AI systems to autonomously seek out the best context from across the internet, bringing a new dimension to knowledge retrieval and grounding.

OpenAI’s Deep Research is similar to tools like Google’s Deep Research, Perplexity and You.com, but OpenAI has tried to differentiate its offering by suggesting that its superior chain-of-thought reasoning makes it more accurate. Here is how these tools work: a company researcher asks the LLM to compile all of the available information about a topic into a well-researched, cited report. The LLM responds by asking the researcher to answer some 20 sub-questions to confirm what is wanted. It then performs 10 or 20 web searches to gather the most relevant data to answer those sub-questions, extracts the knowledge and presents it in a useful way.
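A structural sketch of that loop might look like the following. It is an approximation of the workflow described above, not OpenAI’s implementation: web_search() is a stub you would replace with a real search or scraping API, and the model name, topic and prompts are placeholders.

```python
# Structural sketch of a Deep-Research-style workflow: clarify the request,
# research each sub-question, then synthesize a cited report. Not OpenAI's
# implementation; web_search() is a stub and the prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set


def ask(prompt: str) -> str:
    """Single chat-completion call, kept small for readability."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def web_search(query: str) -> str:
    """Stub: swap in a real search/scrape provider here."""
    return f"[search results for {query!r} would go here]"


topic = "the state of the shipping-container manufacturing market"

# 1. Ask the model which clarifying sub-questions it needs answered first.
sub_questions = ask(
    f"I need a well-researched, cited report on: {topic}. "
    "List the sub-questions you would need answered first, one per line."
).splitlines()

# 2. Gather (stubbed) web results for each sub-question.
findings = [f"Q: {q}\nSources:\n{web_search(q)}" for q in sub_questions if q.strip()]

# 3. Synthesize the findings into a report, which can then be dropped into a
#    company's RAG document store alongside its own data.
report = ask("Write a well-cited report from these findings:\n\n" + "\n\n".join(findings))
print(report)
```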

However, this innovation isn’t without its challenges. Amr Awadallah, the CEO of Vectara, cautioned about the risks of relying too heavily on outputs from models like Deep Research, and he questioned whether it is indeed more accurate. “It’s not clear that this is true,” Awadallah noted. “We’re seeing articles and posts in various forums saying no, they’re getting lots of hallucinations still, and Deep Research is only about as good as other solutions out there on the market.”

In other words, while Deep Research offers promising capabilities, enterprises need to tread carefully when integrating its outputs into their knowledge bases. The grounding knowledge for a model should come from verified, human-approved sources to avoid cascading errors, Awadallah said.

The cost curve is crashing: why this matters

The most immediate impact of DeepSeek’s release is its aggressive price reduction. The tech industry expected costs to come down over time, but few anticipated just how quickly it would happen. DeepSeek has proven that powerful, open models can be both affordable and efficient, creating opportunities for widespread experimentation and cost-effective deployment.

Awadallah emphasized this point, noting that the real game-changer isn’t just the training cost but the inference cost, which for DeepSeek is about 1/30th of OpenAI’s o1 or o3 per token. “The margins that OpenAI, Anthropic, and Google Gemini were able to capture will now have to be squished by at least 90% because they can’t stay competitive with such high pricing,” Awadallah said.

Not only that, but those costs will continue to go down. Dario Amodei, CEO of Anthropic, said recently that the cost of developing models continues to drop at around a 4x rate each year. It follows that the prices LLM providers charge to use them will continue to drop as well. “I fully expect the cost to go to zero,” said Ashok Srivastava, chief data officer of Intuit, a company that has been driving AI hard in its tax and accounting software offerings like TurboTax and QuickBooks. “…and the latency to go to zero. They’re just going to be commodity capabilities that we will be able to use.”

This cost reduction isn’t just a win for developers and enterprise users; it’s a signal that AI innovation is no longer confined to big labs with billion-dollar budgets. The barriers to entry have dropped, and that’s inspiring smaller companies and individual developers to experiment in ways that were previously unthinkable. Most importantly, the models are so accessible that any business professional will be using them, not just AI experts, said Srivastava.

DeepSeek’s disruption: Challenging “Big AI’s” stronghold on model development

Most importantly, DeepSeek has shattered the myth that only major AI labs can innovate. For years, companies like OpenAI and Google positioned themselves as the gatekeepers of advanced AI, spreading the belief that only top-tier PhDs with vast resources could build competitive models.

DeepSeek has flipped that narrative. By making reasoning models open and affordable, it has empowered a new wave of developers and enterprise companies to experiment and innovate without needing billions in funding. This democratization is particularly significant in the post-training stages—like RL and fine-tuning—where the most exciting developments are happening.

This fallacy had forced a lot of other AI builders to the sidelines. DeepSeek has put a stop to that, giving everyone inspiration that there are plenty of ways to innovate in this area.

The data imperative: Why clean, curated data is the next action item for enterprise companies

While DeepSeek and Deep Research offer powerful tools, their effectiveness ultimately hinges on one critical factor: data quality. Getting your data in order has been a big theme for years and has only accelerated over the past nine years of the AI era. But it has become even more important with generative AI, and now, with DeepSeek’s disruption, it’s absolutely key. Hilary Packer, CTO of American Express, underscored this in an interview with VentureBeat yesterday: “The AHA moment for us, honestly, was the data. You can make the best model selection in the world… but the data is key. Validation and accuracy are the holy grail right now of generative AI.”

This is where enterprises must focus their efforts. While it’s tempting to chase the latest models and techniques, the foundation of any successful AI application is clean, well-structured data. Whether you’re using RAG, SFT, or RL, the quality of your data will determine the accuracy and reliability of your models.

And while many companies aspire to perfect their entire data ecosystems, the reality is that perfection is elusive. Instead, businesses should focus on cleaning and curating the most critical portions of their data to enable point AI applications that deliver immediate value.

Related to this, a lot of questions linger around the exact data that DeepSeek used to train its models, and this raises concerns about the inherent bias of the knowledge stored in its model weights. But that’s no different from the questions surrounding other open-source models, such as Meta’s Llama series. Most enterprise users have found ways to fine-tune or ground the models with RAG well enough to mitigate any problems around such biases. And that’s been enough to create serious momentum within enterprise companies toward accepting open source, indeed even leading with open source.

Similarly, there’s no question that many companies will be using DeepSeek models, regardless of the fear around the fact that the company is from China. It’s also true, though, that a lot of companies in highly regulated industries such as finance or healthcare are going to be cautious about using any DeepSeek model in any application that interfaces directly with customers, at least in the short term.

Conclusion: The future of enterprise AI is open, affordable, and data-driven

DeepSeek and OpenAI’s Deep Research are more than just new tools in the AI arsenal; they’re signals of a profound shift in which enterprises will roll out masses of purpose-built models that are extremely affordable, competent, and grounded in the company’s own data and approach.

For enterprises, the message is clear: the tools to build powerful, domain-specific AI applications are at your fingertips. You risk falling behind if you don’t leverage these tools. But real success will come from how you curate your data, leverage techniques like RAG and distillation, and innovate beyond the pre-training phase.

As AmEx’s Packer put it, the companies that get their data right will be the ones leading the next wave of AI innovation.


