Cohere and AI polemics

I hate to be banal, but I have been getting deeper into studying and playing with AI over the past few weeks. Last month was earth-shattering in terms of AI developments, the most monumental of which is that I, Liddl’ NoBull Klev, have launched a ChatGPT competitor. [Lenny was a real AI chatbot I built for demonstration purposes, but I have since taken it down.] I call it Lenny, after the character Lennie in Of Mice and Men.

As in the book (or movie), Lenny is not too bright and he’s a bit aggressive. It's just meant to be a fun demo. Still, in my testing, on some questions, Lenny beats world-class Canadian AI company Cohere. And it only took me a day to build Lenny - I still had time to eat three meals, two with ice cream.

Here’s Lenny addressing the simple task of coming up with 10 sentences that end in the word apple. You can see Lenny gets this mostly, but not entirely, right.
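For what it's worth, a test like this is easy to score with a few lines of code rather than by eye. Below is a minimal sketch in Python, not the way I actually graded Lenny. The function name and the sample reply are made up for illustration, and it assumes the chatbot answers in plain sentences without list numbering.

```python
import re

def count_sentences_ending_in(text: str, word: str) -> tuple[int, int]:
    """Return (hits, total): sentences ending in `word`, and sentences found."""
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    hits = 0
    for sentence in sentences:
        # Extract the words, ignoring punctuation, and compare the last one.
        words = re.findall(r"[A-Za-z']+", sentence)
        if words and words[-1].lower() == word.lower():
            hits += 1
    return hits, len(sentences)

# Hypothetical chatbot reply, invented for illustration only.
sample = "She bit into a crisp apple. He prefers apple pie. For lunch I packed an apple!"
hits, total = count_sentences_ending_in(sample, "apple")
print(f"{hits} of {total} sentences end in 'apple'")
```

Run the same prompt through each chatbot, feed the replies to the function, and you get a score out of 10 instead of an impression.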

You will also see near the end that I trained Lenny to be assertive. Now, here is Cohere - incapable of producing a single sentence ending in apple:

Wow, even Sleepy Joe Biden could have done better. Anyways, those are just anecdotes. Overall, I concede that Cohere is smarter than Lenny. But they are not too far apart - both Lenny and Cohere would broadly rank as “C” students. I will address my project and more rigorous metrics later.

Let’s start with the official news last month that Toronto’s Cohere has raised US$500M at a valuation of US$5.5B - only 5 years after its founding. Now, let’s wheel John Ruffolo into this; he’s always good for polemics.

[Audio clip: Ruffolo on Cohere prospects.]

These are comments from 3 months ago, when the first press reports of Cohere’s financing surfaced. As you know, I have always been respectful - deferential, even - towards John Ruffolo. But this time, I have to call it: he doesn’t know what he’s talking about. It is true that Cohere is broadly in the Large Language Model (LLM) race.

I will cover some basics, just so everyone can follow. I immersed myself in this fairly recently, and once you understand the dynamics of the race, it’s as interesting as watching sports or politics. LLMs are the models that power chatbots like ChatGPT (or, for that matter, Lenny). They are a form of AI that primarily deals with processing language, though they can power many more applications besides chat. LLMs are “trained” by being fed massive amounts of information found on the internet.

ChatGPT’s current "frontier" LLM is GPT-4o (frontier meaning leading edge). It is widely acknowledged as the best LLM at the moment. Google is definitely in contention as well. Its latest model, released just this week, has essentially caught up with, and maybe even surpassed, GPT-4o. Microsoft owns a large chunk of ChatGPT parent OpenAI and they’re in a close partnership.

And so when Ruffolo implies that Microsoft and Google are presumptive favorites, he is right. But when he mentions Cohere in the same breath as a contender for what he speculates will be a Big Three, he is wrong. There’s already another 800-pound gorilla vying for domination - never mind third place. And that’s Mark Zuckerberg (aka Facebook, aka Meta). Last month, the world once again saw a demonstration of the adage “Don’t f**** with Zuck.”

You will recall that Meta’s stock plunged 60% or so in 2022 as Zuck had become consumed with building the “Metaverse”. In 2023, Zuck pivoted his focus to AI in a bid to catch up with OpenAI. He deployed a classic tech playbook for a player who is behind: he decided to commoditize the whole space by giving away his LLM. Meta’s LLM, called Llama, is open source, meaning anyone can use it for free. Last month, Meta released version 3.1, and on many objective tests it has achieved near-parity with the leading models. You could see Meta’s trajectory well before this week.

LLMs all use the same general architecture, called the “Transformer”, which was initially developed by Google. (While an intern at Google, Cohere’s CEO Aidan Gomez was part of the group that authored the paper on the topic.) Since they all use similar techniques, the ability to spend the money to acquire computing power is a major lever. Zuck has the execution ability and billions of his own money, and he doesn’t have to answer to anyone. This makes him a formidable competitor in the LLM race.

How was I able to develop Lenny in a day? I didn’t have to train my own LLM. Because of Zuck’s drastic decision to go open source, everyone has access to a very credible LLM for free (aside from cloud computing costs). I specifically use the 8-billion-parameter version of Meta’s Llama 3.1. The smartest version is the 405B version, but my cloud provider doesn’t offer that yet. And so my chatbot is not the smartest, though, as I mentioned, it beats Cohere at some tasks.

Cohere is a startup staffed by smart people in a very promising and fast-moving field - anything is possible. They’re a team of Gen Z, first-time startup founders in the tough-to-crack enterprise market, in a full-frontal war against Big Tech in a space that everyone considers of vital strategic importance. Can you recall any other time in tech history when this happened? Only the early years of Microsoft and Apple spring to mind. I wish Cohere well. In Part 2, I will address the specific prediction Ruffolo made - that Cohere is in contention to be a Big Three LLM provider.
