NVIDIA share price
Insights on NVIDIA
Insights
Invest better with AI
aktien.guide Unlimited: all the details of the AI analyses
👉 More detailed insights
👉 Exclusive insight into opportunities and risks
👉 Clear answers to your questions
aktien.guide Basis
Key metrics
📘 Market capitalization
📈 What is it?
Market capitalization shows how much a company is currently worth according to the stock market.
🧮 How is it calculated?
🏛️ Why does it matter?
It helps sort companies into size classes (large, mid, small cap) and gives an indication of market power and stability.
🧮 Calculation
🎯 What does it mean for investors?
- Large companies are considered more stable and often pay dividends, but they grow more slowly.
- Small companies can grow faster, but their share prices are more volatile.
- Market capitalization is a good indicator of company size, but it is not a measure of undervaluation or overvaluation.
📘 Enterprise value (EV)
📈 What is it?
Enterprise value (EV) shows what a company would actually cost if it were acquired in full: including its debt and net of its cash.
🧮 How is it calculated?
EV = market capitalization + net debt
🏛️ Why does it matter?
EV is a more realistic valuation basis than market capitalization because it takes the capital structure into account. It is the basis for ratios such as EV/FCF and EV/Sales.
🧮 Calculation
🎯 What does it mean for investors?
- Enterprise value shows what a company is actually worth, regardless of how it is financed.
- It is especially important for professional investors because it offers a more objective basis for valuation comparisons than market capitalization alone.
- A company with high debt looks more expensive on an EV basis and one with a lot of cash looks cheaper, even if both have the same market value.
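The effect described in the last point can be sketched in a few lines of Python. This is a minimal illustration, assuming net debt is simply total debt minus cash; the function names and both companies' figures are hypothetical and are not taken from any real balance sheet.

```python
def net_debt(total_debt: float, cash: float) -> float:
    """Debt that remains after subtracting available cash."""
    return total_debt - cash


def enterprise_value(market_cap: float, total_debt: float, cash: float) -> float:
    """EV = market capitalization + net debt."""
    return market_cap + net_debt(total_debt, cash)


# Two hypothetical companies with the same market value (figures in billions).
indebted = enterprise_value(market_cap=100.0, total_debt=40.0, cash=5.0)
cash_rich = enterprise_value(market_cap=100.0, total_debt=5.0, cash=40.0)

print(f"EV of the indebted company:  {indebted:.1f}")
print(f"EV of the cash-rich company: {cash_rich:.1f}")
```

Both companies have the same market capitalization, yet the indebted one has the higher enterprise value because a buyer would also take on its net debt.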
📘 Net debt
📈 What is it?
Net debt shows how much debt actually remains after subtracting available cash.
🧮 How is it calculated?
🏛️ Why does it matter?
It shows how dependent a company is on borrowed capital and how well placed it is to service its debt in the short term.
🧮 Calculation
🎯 What does it mean for investors?
- Low or negative net debt indicates high financial stability.
- Companies with plenty of cash and little debt are better equipped for crises.
- High net debt increases risk, especially when interest rates rise or the economy weakens.
📘 Cash
📈 What is it?
The cash position shows how much liquid funds a company has immediately available.
🧮 How is it calculated?
🏛️ Why does it matter?
It indicates financial flexibility: a large cash position allows for investments and buybacks and makes the company more resilient in a crisis.
🧮 Calculation
🎯 What does it mean for investors?
- A large cash position shows financial strength and room to maneuver.
- Cash can be used for investments, debt repayment or share buybacks.
- However, too much idle capital can also point to a lack of investment ideas.
📘 Shares outstanding
📈 What is it?
The number of shares outstanding indicates how many of a company's shares are currently in circulation and held by investors.
🧮 How is it calculated?
🏛️ Why does it matter?
It is the basis for many metrics such as earnings per share (EPS), market capitalization and the P/E ratio.
🧮 Calculation
🎯 What does it mean for investors?
- The fewer shares in circulation, the higher figures such as earnings per share turn out, which matters for valuation and dividend yield.
- Share buybacks reduce the number of shares outstanding and increase the value per share.
- Capital increases have the opposite effect: more shares → dilution of existing holdings.
📘 Price-to-earnings ratio (P/E)
📈 What is it?
The P/E ratio shows how many times earnings per share fit into the current share price, in other words how "expensive" a share is relative to its earnings.
🧮 How is it calculated?
🏛️ Why does it matter?
The P/E ratio is one of the best-known valuation metrics. It helps investors judge whether a share looks cheap or expensive relative to its earnings.
🧮 Calculation
📊 P/E (TTM) = based on earnings over the last 12 months (trailing twelve months):
🎯 What does it mean for investors?
- A low P/E can point to a cheap valuation, or to problems in the business model.
- A high P/E can reflect growth expectations, or an overvalued share.
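As a rough sketch of the definition above (share price divided by earnings per share), the ratio can be computed as follows. The function name and the example figures are hypothetical. Returning `None` for zero or negative earnings is an assumption made here, since the ratio is not meaningful in that case.

```python
from typing import Optional


def pe_ratio(share_price: float, eps: float) -> Optional[float]:
    """Price-to-earnings ratio: how many times EPS fits into the share price.

    Returns None when EPS is zero or negative (assumption: the ratio is
    treated as not meaningful for loss-making companies).
    """
    if eps <= 0:
        return None
    return share_price / eps


# Hypothetical share: price 150.00, trailing-twelve-month EPS 5.00.
print(pe_ratio(150.0, 5.0))   # profitable company
print(pe_ratio(150.0, -1.0))  # loss-making company
```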
📘 Price-to-sales ratio (P/S)
📈 What is it?
The P/S ratio shows how much investors pay for €1 of a company's revenue, regardless of profit.
🧮 How is it calculated?
🏛️ Why does it matter?
The P/S ratio is especially useful for fast-growing or not yet profitable companies. It shows how highly the stock market values the company's revenue.
🧮 Calculation
Market capitalization = $4.84 trillion | Revenue (TTM) = $253.49 billion
Market capitalization = $4.84 trillion | Expected revenue = $400.97 billion
🎯 What does it mean for investors?
- A low P/S can point to undervaluation, or to weak margins.
- A high P/S can reflect high expectations, or excessive optimism.
- It is particularly useful for growth companies whose earnings or free cash flow are not (yet) meaningful.
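The two lines of figures listed under the calculation heading are enough to reproduce the ratio. The sketch below assumes the usual definition, market capitalization divided by revenue; the variable and function names are illustrative, and the results are only as precise as the rounded inputs shown on the page.

```python
# Figures as quoted on the page (US dollars, rounded).
MARKET_CAP = 4.84e12         # 4.84 trillion
REVENUE_TTM = 253.49e9       # 253.49 billion, trailing twelve months
REVENUE_EXPECTED = 400.97e9  # 400.97 billion, analyst expectation


def price_to_sales(market_cap: float, revenue: float) -> float:
    """P/S: what investors pay per unit of revenue."""
    return market_cap / revenue


ps_ttm = price_to_sales(MARKET_CAP, REVENUE_TTM)
ps_expected = price_to_sales(MARKET_CAP, REVENUE_EXPECTED)

print(f"P/S (TTM):      {ps_ttm:.2f}")
print(f"P/S (expected): {ps_expected:.2f}")
```

With these inputs the trailing ratio comes out at roughly 19 and the expected ratio at roughly 12, because the same market value is spread over a larger expected revenue.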
📘 Enterprise value to sales (EV/Sales)
📈 What is it?
EV/Sales shows how much investors pay for €1 of a company's revenue once debt and cash are also taken into account. It is a capital-structure-adjusted version of the P/S ratio.
🧮 How is it calculated?
🏛️ Why does it matter?
This metric is particularly suited to comparing companies with different levels of debt. It shows how expensive a company really is relative to its revenue.
🧮 Calculation
Enterprise value = $4.79 trillion | Revenue (TTM) = $253.49 billion
Enterprise value = $4.79 trillion | Expected revenue = $400.97 billion
🎯 What does it mean for investors?
- EV/Sales is neutral with respect to capital structure and works well for company comparisons.
- A low ratio can point to a cheaply valued share, a high ratio to high expectations or overvaluation.
- It is especially useful for fast-growing, not yet profitable companies.
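The same arithmetic works for EV/Sales with the enterprise value figures quoted under the calculation heading. As a side calculation, the sketch also backs out the net debt implied by EV = market capitalization + net debt, using the market capitalization of $4.84 trillion quoted in the P/S section. All names are illustrative, and because the published inputs are rounded, the implied net debt is only a rough estimate.

```python
# Figures as quoted on the page (US dollars, rounded).
ENTERPRISE_VALUE = 4.79e12   # 4.79 trillion
MARKET_CAP = 4.84e12         # 4.84 trillion (from the P/S section)
REVENUE_TTM = 253.49e9       # 253.49 billion
REVENUE_EXPECTED = 400.97e9  # 400.97 billion


def ev_to_sales(enterprise_value: float, revenue: float) -> float:
    """EV/Sales: enterprise value per unit of revenue."""
    return enterprise_value / revenue


ev_sales_ttm = ev_to_sales(ENTERPRISE_VALUE, REVENUE_TTM)
ev_sales_expected = ev_to_sales(ENTERPRISE_VALUE, REVENUE_EXPECTED)

# EV = market cap + net debt, so net debt = EV - market cap.
implied_net_debt = ENTERPRISE_VALUE - MARKET_CAP

print(f"EV/Sales (TTM):      {ev_sales_ttm:.2f}")
print(f"EV/Sales (expected): {ev_sales_expected:.2f}")
print(f"Implied net debt:    {implied_net_debt / 1e9:.0f} billion")
```

The implied net debt is negative here, which means cash exceeds debt and the enterprise value sits slightly below the market capitalization.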
📘 Enterprise value to free cash flow (EV/FCF)
📈 What is it?
EV/FCF shows how many years it would take a company to "earn back" its enterprise value through free cash flow.
🧮 How is it calculated?
🏛️ Why does it matter?
This metric helps value companies on the basis of their actual cash earnings, independent of accounting rules or reported profit.
🧮 Calculation
🎯 What does it mean for investors?
- A low EV/FCF points to a cheap valuation combined with strong cash generation.
- A high EV/FCF can indicate either optimism or temporarily weak cash flow.
- It is especially helpful for mature, profitable companies with stable cash flows.
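Read as a payback period, the ratio is enterprise value divided by annual free cash flow. The sketch below uses hypothetical figures and a hypothetical function name; it also assumes, as a simplification, that free cash flow stays constant, which real companies rarely manage.

```python
from typing import Optional


def ev_to_fcf(enterprise_value: float, free_cash_flow: float) -> Optional[float]:
    """EV/FCF: years of (constant) free cash flow needed to match the EV.

    Returns None when free cash flow is zero or negative, because no
    payback period can be stated in that case (assumption of this sketch).
    """
    if free_cash_flow <= 0:
        return None
    return enterprise_value / free_cash_flow


# Hypothetical company: EV of 500, annual free cash flow of 25 (same unit).
print(ev_to_fcf(500.0, 25.0))
print(ev_to_fcf(500.0, -10.0))
```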
📘 Price-to-book ratio (P/B)
📈 What is it?
The P/B ratio shows how high a company's market value is relative to the equity on its balance sheet.
🧮 How is it calculated?
🏛️ Why does it matter?
The P/B ratio is particularly relevant for asset-heavy value stocks (e.g. banks, industrials). It helps investors see whether a company is valued below or above its book assets.
🧮 Calculation
🎯 What does it mean for investors?
- A P/B below 1 can point to undervaluation or weak profitability.
- A P/B above 1 shows that the market attributes value to the company beyond its book value (e.g. brands, patents, growth).
- The P/B ratio works especially well for companies with stable, tangible assets.
📘 Dividend per share
📈 What is it?
Dividend per share shows how much money a company pays out to its shareholders per share, typically annually or quarterly.
🧮 How is it calculated?
🏛️ Why does it matter?
It is the absolute size of the payout per share, which matters to anyone looking for regular income or following a dividend strategy.
🧮 Calculation
🎯 What does it mean for investors?
- A stable or growing dividend per share is often a sign of a solid business model.
- Dividend per share alone says nothing about the return, though; for that the share price matters too (→ dividend yield).
- Dividends that rise over the long term are often a very good sign (e.g. Dividend Aristocrats).
📘 Dividend yield
📈 What is it?
Dividend yield shows how high a company's dividend is relative to its share price.
🧮 How is it calculated?
🏛️ Why does it matter?
It makes dividend stocks comparable, regardless of the absolute payout amount.
🧮 Calculation
🎯 What does it mean for investors?
- A stable dividend yield can point to reliable payouts.
- Comparing the 1-year and 5-year yield helps show whether dividend growth is keeping pace with share price growth.
- A low yield is not necessarily negative; it can be the result of strong share price growth.
📘 Dividend growth
📈 What is it?
Dividend growth shows how strongly a company has increased its dividend per share over time.
🧮 How is it calculated?
5Y: average annual growth rate (CAGR)
🏛️ Why does it matter?
Steadily rising dividends are seen as a sign of financial strength and shareholder focus, which makes them especially interesting for long-term investors.
🧮 Calculation
🎯 What does it mean for investors?
- Stable dividend growth is a sign of sustainable earning power.
- High dividend growth can be a significant lever for your return:
- If, for example, a company pays a €1 dividend and raises it by 15% a year, then after five increases you receive about €2 per share, roughly twice as much as at the start.
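The compounding in this example, and the CAGR mentioned under the calculation heading, can be checked with a short sketch. The function names are illustrative; the inputs are the €1 starting dividend and the 15% annual increase from the example above.

```python
def dividend_after(start: float, annual_growth: float, increases: int) -> float:
    """Dividend per share after a number of compounding annual increases."""
    return start * (1 + annual_growth) ** increases


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between a start and an end value."""
    return (end / start) ** (1 / years) - 1


# Start at 1.00 per share and raise the dividend by 15% five times.
final_dividend = dividend_after(1.0, 0.15, 5)
print(f"Dividend after five increases: {final_dividend:.2f}")

# Going the other way: recover the growth rate from start and end values.
print(f"CAGR: {cagr(1.0, final_dividend, 5):.1%}")
```

Five increases of 15% multiply the dividend by 1.15 to the fifth power, which is slightly more than 2.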
📘 Payout ratio
📈 What is it?
The payout ratio shows what percentage of the company's earnings (per share) is paid out to shareholders as a dividend.
🧮 How is it calculated?
🏛️ Why does it matter?
The ratio helps assess whether a dividend is sustainable in the long run, particularly in relation to the earnings achieved.
🧮 Calculation
🎯 What does it mean for investors?
- A low payout ratio means the company retains a larger share of its earnings for investment, which is typical of growth companies.
- A moderate ratio (e.g. 25 to 50%) often stands for a healthy balance between payouts and investment in the future.
- High payout ratios can look attractive but are riskier when earnings fluctuate or fall.
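A minimal sketch of the ratio as defined above, dividend per share as a percentage of earnings per share. The function name and figures are hypothetical, and returning `None` for non-positive earnings is an assumption of this sketch rather than a rule from the page.

```python
from typing import Optional


def payout_ratio_pct(dividend_per_share: float, eps: float) -> Optional[float]:
    """Share of earnings per share paid out as dividend, in percent.

    Returns None when EPS is zero or negative (assumption: a percentage of
    a loss is not a useful figure).
    """
    if eps <= 0:
        return None
    return dividend_per_share / eps * 100


# Hypothetical company: 1.00 dividend per share on 4.00 earnings per share.
print(payout_ratio_pct(1.0, 4.0))
# Hypothetical company paying a dividend despite a loss.
print(payout_ratio_pct(1.0, -2.0))
```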
📘 Consecutive dividend increases
📈 What is it?
This metric shows how many years in a row a company has raised its dividend per share, without a cut or suspension.
🧮 How is it calculated?
🏛️ Why does it matter?
A long track record of continuous increases speaks for reliability, solid finances and shareholder-friendly corporate policy.
🎯 What does it mean for investors?
- A long period of dividend increases builds trust, especially in times of crisis.
- Such companies are considered reliable and predictable for income investors.
- The longer the streak, the stronger the commitment to shareholders.
📘 Revenue
📈 What is it?
Revenue shows how much a company takes in overall from its products and services, i.e. its gross proceeds before costs are deducted.
🧮 How is it calculated?
🏛️ Why does it matter?
Revenue is one of the central metrics for assessing company size, market position and growth.
🧮 Calculation
🎯 What does it mean for investors?
- Growing revenue shows rising demand and can be a good leading indicator of earnings growth.
- Comparing current and expected revenue gives clues about the market environment and analyst expectations.
- Important: strong revenue alone is not enough; margins and profitability count too.
📘 EBITDA
📈 What is it?
EBITDA stands for "Earnings Before Interest, Taxes, Depreciation and Amortization". It shows a company's operating result, adjusted for accounting and financing effects.
🧮 How is it calculated?
🏛️ Why does it matter?
EBITDA is a widely used metric for assessing operating performance, particularly for capital-intensive companies or in international comparisons.
🧮 Calculation
🎯 What does it mean for investors?
- High or growing EBITDA speaks for strong operating earnings, independent of accounting treatment or tax burden.
- EBITDA is especially useful for comparing companies across industries.
- Important: EBITDA is not an official profit measure, because depreciation, amortization and financing costs are left out.
📘 EBIT
📈 What is it?
EBIT stands for "Earnings Before Interest and Taxes". It shows a company's operating result after depreciation and amortization, but before financing and tax expenses.
🧮 How is it calculated?
🏛️ Why does it matter?
EBIT is a central metric for assessing the profitability of the core business, independent of capital structure or tax system.
🧮 Calculation
🎯 What does it mean for investors?
- High EBIT points to a profitable core business, before interest burdens or tax effects.
- It allows more objective comparisons between companies with different financing.
- Compared with EBITDA, EBIT already shows the effect of depreciation and amortization on the operating result.
📘 Net income
📈 What is it?
Net income is a company's remaining profit (or loss) for the year after deducting all costs, taxes, interest, depreciation and amortization.
🧮 How is it calculated?
🏛️ Why does it matter?
Net income is the central measure of success: it shows how profitably a company actually operates after all costs.
🧮 Calculation
🎯 What does it mean for investors?
- Rising net income shows that the company is operating efficiently, despite all its costs.
- The development of earnings directly influences the P/E ratio and other metrics, for example.
- Over time it reveals how stable and profitable a business model really is.
📘 Free cash flow (FCF)
📈 What is it?
Free cash flow is the cash a company has left after all operating expenses and investments.
🧮 How is it calculated?
🏛️ Why does it matter?
FCF reflects a company's real financial strength, regardless of accounting profits. It shows how much flexibility a company has for dividends, share buybacks or debt reduction.
🧮 Calculation
🎯 What does it mean for investors?
- High free cash flow means that a company has real financial strength, independent of reported profit.
- It is often the most solid basis for sustainable dividends and share buybacks.
- Falling FCF can be a warning sign, even when profit looks stable.
📘 Revenue growth
📈 What is it?
Revenue growth shows how much a company's revenue has changed compared with the previous year, both actual (TTM) and forecast (expected).
🧮 How is it calculated?
Expected = (expected revenue ÷ prior-year revenue − 1) × 100
Expected growth is based on analyst estimates for the current fiscal year.
🏛️ Why does it matter?
Growing revenue is a central signal of rising demand, business expansion and market share gains, especially for growth companies.
🧮 Calculation
🎯 What does it mean for investors?
- Growth is the engine of long-term value creation, especially for technology and growth stocks.
- What matters is not only current growth but also how sustainable it is.
- Forecasts show whether analysts expect further potential or a slowdown.
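The expected-growth formula given under the calculation heading translates directly into code. The sketch below uses a hypothetical function name and hypothetical figures; the same calculation applies to any metric that is compared with its prior-year value.

```python
def growth_pct(expected: float, prior_year: float) -> float:
    """Growth in percent: (expected / prior year - 1) * 100."""
    return (expected / prior_year - 1) * 100


# Hypothetical figures: prior-year revenue 100, expected revenue 120.
print(f"Expected growth: {growth_pct(120.0, 100.0):.1f}%")

# A shrinking figure yields a negative growth rate.
print(f"Expected growth: {growth_pct(90.0, 100.0):.1f}%")
```

The formula assumes a positive prior-year value; with a loss in the prior year, which can happen for earnings figures, the percentage is not meaningful.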
📘 EBITDA growth
📈 What is it?
EBITDA growth shows how much a company's operating result before interest, taxes, depreciation and amortization has risen or fallen compared with the previous year.
🧮 How is it calculated?
Expected = (expected EBITDA ÷ prior-year EBITDA − 1) × 100
Expected growth is based on analyst estimates for the current fiscal year.
🏛️ Why does it matter?
Rising EBITDA is a sign of improved operating earning power, independent of financing structure or depreciation.
🧮 Calculation
🎯 What does it mean for investors?
- Strong EBITDA growth signals operating efficiency and scaling, which is particularly relevant in growth phases.
- EBITDA growth is a leading indicator of margin and earnings development, but it should always be viewed together with revenue and EBIT.
📘 EBIT growth
📈 What is it?
EBIT growth shows how much a company's operating result (after depreciation and amortization, but before interest and taxes) has grown compared with the previous year.
🧮 How is it calculated?
Expected = (expected EBIT ÷ prior-year EBIT − 1) × 100
Expected growth is based on analyst estimates for the current fiscal year.
🏛️ Why does it matter?
EBIT growth is a direct indicator of how the operating business is developing, taking capital intensity (depreciation and amortization) into account.
🧮 Calculation
🎯 What does it mean for investors?
- Rising EBIT signals growing operating profitability, even after depreciation and amortization.
- EBIT growth is an important measure for assessing business models with high investment costs.
- Together with revenue and EBITDA growth, it gives a comprehensive picture of operating development.
📘 Net income growth
📈 What is it?
Net income growth shows how much a company's profit for the year has risen or fallen compared with the previous year, both actual (TTM) and based on forecasts (expected).
🧮 How is it calculated?
Expected = (expected net income ÷ prior-year net income − 1) × 100
The expected value is based on analyst estimates for the current fiscal year.
🏛️ Why does it matter?
Profit is the decisive result figure for a company. Growing net income points to rising efficiency, stable cost control and sustainable earning power.
🧮 Calculation
🎯 What does it mean for investors?
- Growing net income supports the valuation, the capacity to pay dividends and share price potential.
- Stagnating or declining profit despite revenue growth can point to margin pressure.
📘 Free cash flow growth
📈 What is it?
Free cash flow growth shows how a company's free cash inflow has changed compared with the previous year, i.e. the amount left over after all operating expenses and investments.
🧮 How is it calculated?
🏛️ Why does it matter?
Free cash flow is the real, available cash inflow. Growth here is a sign of financial strength and increasing flexibility for dividends, buybacks or investments.
🧮 Calculation
🎯 What does it mean for investors?
- Falling free cash flow can point to rising investments, higher costs or stagnating operating earnings.
- FCF growth is particularly important for dividend stocks, because dividends are ultimately paid out of available cash.
- A negative trend should be analyzed more closely: it is not necessarily bad, but it is potentially a warning sign.
📘 Gross margin
📈 What is it?
Gross margin shows how much of revenue remains as gross profit after deducting direct production costs (materials, manufacturing).
🧮 How is it calculated?
Also: gross margin = gross profit ÷ revenue × 100
🏛️ Why does it matter?
Gross margin indicates the profitability of a product or business model before fixed costs, taxes and interest. It shows how efficiently a company can produce or purchase.
🧮 Calculation
🎯 What does it mean for investors?
- A high gross margin points to strong pricing power and efficient production.
- Falling gross margins can point to cost increases or price pressure.
- Compared with competitors in particular, gross margin offers valuable insight into the quality of the business.
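Combining the definition (revenue minus direct production costs) with the formula quoted under the calculation heading gives the following sketch. The function name and all figures are hypothetical; they only illustrate how a cost increase that cannot be passed on to customers shows up in the margin.

```python
def gross_margin_pct(revenue: float, direct_costs: float) -> float:
    """Gross margin = gross profit / revenue * 100.

    Gross profit is revenue minus direct production costs.
    """
    gross_profit = revenue - direct_costs
    return gross_profit / revenue * 100


# Hypothetical company: revenue 200, direct production costs 60.
before = gross_margin_pct(revenue=200.0, direct_costs=60.0)

# Same revenue, but direct costs rise to 80.
after = gross_margin_pct(revenue=200.0, direct_costs=80.0)

print(f"Gross margin before cost increase: {before:.1f}%")
print(f"Gross margin after cost increase:  {after:.1f}%")
```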
📘 EBITDA margin
📈 What is it?
EBITDA margin shows how much of revenue remains as operating profit before interest, taxes, depreciation and amortization (EBITDA). It measures operating efficiency without distortions from financing or book values.
🧮 How is it calculated?
🏛️ Why does it matter?
EBITDA margin helps show how much operating profit a company earns from each euro of revenue, independent of capital structure or tax environment.
🧮 Calculation
🎯 What does it mean for investors?
- A high EBITDA margin shows strong operating earning power, independent of accounting effects.
- The margin allows good comparisons between companies and industries.
- A stable or growing value can point to efficient cost control and scalability.
📘 EBIT margin
📈 What is it?
EBIT margin shows what percentage of revenue remains as operating profit after depreciation and amortization, but before interest and taxes.
🧮 How is it calculated?
🏛️ Why does it matter?
EBIT margin measures a company's operating earning power taking capital intensity (e.g. machinery, plant) into account. It is well suited to comparing business models with different levels of depreciation.
🧮 Calculation
🎯 What does it mean for investors?
- A high EBIT margin shows that a company operates efficiently even after depreciation and amortization.
- It is particularly relevant in capital-intensive industries.
- Margins that are stable or rising over the long term are a sign of economic strength and pricing power.
📘 Net margin
📈 What is it?
Net margin shows how much of revenue is left at the end as net profit, i.e. after deducting all costs, interest, taxes, depreciation and amortization.
🧮 How is it calculated?
🏛️ Why does it matter?
Net margin indicates how efficiently a company operates across all stages. It shows how much profit actually remains per euro of revenue.
🧮 Calculation
🎯 What does it mean for investors?
- A high net margin shows that a company is not only strong operationally but also has its financing and tax burden under control.
- Comparisons with competitors give insight into economic quality.
- Falling net margins despite revenue growth can be a warning sign, for example of rising costs or declining efficiency.
📘 Free cash flow margin
📈 What is it?
Free cash flow margin shows how much of revenue actually remains as free cash inflow after deducting all operating expenses and investments.
🧮 How is it calculated?
🏛️ Why does it matter?
This margin measures the real liquidity a company generates, independent of accounting rules or depreciation. It is particularly relevant for dividends, buybacks and investments.
🧮 Calculation
🎯 What does it mean for investors?
- A high free cash flow margin shows that a company generates liquid funds on a sustainable basis.
- It is a strong signal of financial stability and payout potential.
- The long-term trend is what matters: falling values can point to rising investments or declining operating efficiency.
📘 Equity ratio
📈 What is it?
The equity ratio shows how large equity is as a share of a company's total assets, i.e. how much it finances itself from its own funds.
🧮 How is it calculated?
🏛️ Why does it matter?
A high equity ratio stands for financial stability, crisis resilience and good creditworthiness. It is particularly relevant when assessing indebtedness.
🧮 Calculation
🎯 What does it mean for investors?
- A high equity ratio signals financial stability, especially in times of crisis.
- A low value can point to higher risk or aggressive borrowing.
- Important: the equity ratio should always be viewed together with return on equity. Only then is it possible to judge whether a company operates not just soundly but also efficiently.
📘 Return on equity (ROE)
📈 What is it?
Return on equity shows how efficiently a company works with its shareholders' capital, i.e. how much profit it generates per euro of equity.
🧮 How is it calculated?
🏛️ Why does it matter?
Return on equity is a central profitability metric. It helps investors see whether the company earns an attractive return on the equity employed.
🧮 Calculation
🎯 What does it mean for investors?
- A high return on equity speaks for a strong, efficient business model.
- It is particularly interesting for capital-intensive companies or those with a high equity ratio.
- Important: a very high ROE can also point to high debt, so it should always be viewed in the context of the equity ratio.
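The interaction between ROE and the equity ratio can be illustrated with two hypothetical companies that have identical assets and identical profit but different financing. The function names and figures are assumptions made for this sketch, and for simplicity it ignores the interest the more indebted company would pay, which would reduce its profit in practice.

```python
def roe_pct(net_income: float, equity: float) -> float:
    """Return on equity: profit per unit of equity, in percent."""
    return net_income / equity * 100


def equity_ratio_pct(equity: float, total_assets: float) -> float:
    """Equity as a share of total assets, in percent."""
    return equity / total_assets * 100


TOTAL_ASSETS = 100.0
NET_INCOME = 10.0

# Hypothetical financing: (label, equity); the rest of the assets is debt.
for label, equity in [("mostly equity-financed", 80.0),
                      ("mostly debt-financed", 20.0)]:
    print(f"{label}: equity ratio {equity_ratio_pct(equity, TOTAL_ASSETS):.0f}%, "
          f"ROE {roe_pct(NET_INCOME, equity):.1f}%")
```

The debt-financed company reports a much higher ROE on the same profit, which is why the two metrics are best read together.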
📘 Return on Capital Employed (ROCE)
📈 What is it?
ROCE measures a company's overall profitability, i.e. how efficiently it uses the capital employed (equity and debt) to generate profit.
🧮 How is it calculated?
Capital employed is all the capital needed to run the business, regardless of where the financing comes from.
🏛️ Why does it matter?
ROCE is particularly well suited to comparing companies that are financed differently. It shows how effectively a company invests capital, independent of capital structure.
🧮 Calculation
🎯 What does it mean for investors?
- A high ROCE shows that a company deploys its capital efficiently, regardless of whether it is financed by equity or debt.
- The higher the ROCE compared with similar companies, the more value the company creates with its invested capital.
- ROCE is especially important for companies with high investment needs, e.g. in industry, energy or infrastructure.
📘 Return on Invested Capital (ROIC)
📈 What is it?
ROIC shows how efficiently a company invests the capital that is tied up in the operating business over the long term, regardless of whether it comes from equity or debt.
🧮 How is it calculated?
- NOPAT = "Net Operating Profit After Taxes"
- Invested capital = operating assets minus non-interest-bearing liabilities
🏛️ Why does it matter?
ROIC is one of the most precise metrics for assessing return on capital, especially compared with return on equity, because it avoids distortions caused by debt. It shows whether a company creates added value for all providers of capital.
🧮 Calculation
🎯 What does it mean for investors?
- A high ROIC shows how well a company manages the capital actually invested in (and needed for) the business.
- Unlike ROCE, it considers only capital that really finances operating activities and has to earn a return.
- It is especially helpful for realistically comparing the return on capital of companies with a lot of "surplus" capital or interest-free liabilities.
📘 Leverage ratio
📈 What is it?
The leverage ratio shows how heavily a company is financed by interest-bearing debt (e.g. loans and bonds) relative to equity.
🧮 How is it calculated?
🏛️ Why does it matter?
The metric helps assess financial risk and dependence on borrowed capital. High leverage can boost return on equity, but it also carries higher risks when interest rates rise or liquidity runs short.
🧮 Calculation
🎯 What does it mean for investors?
- Low leverage stands for financial stability and independence.
- A high value can point to elevated risks, particularly when interest rates fluctuate or the economy weakens.
- Important: always assess it in the context of the industry and its capital intensity.
📘 Earnings per share (EPS)
📈 What is it?
Earnings per share (EPS) shows how much profit is attributable to a single share and is one of the most important metrics for valuing companies.
🧮 How is it calculated?
The diluted share count also includes potential new shares, for example from options, convertible bonds or other conversion rights.
🏛️ Why does it matter?
EPS forms the basis for many valuation metrics such as the P/E ratio, PEG or payout ratio. It makes profit comparable for shareholders, regardless of company size.
🧮 Calculation
🎯 What does it mean for investors?
- EPS helps capture profitability per share and is especially important in comparisons over time or against analyst estimates.
- Rising EPS can be a sign of stable growth or of share buybacks.
- Important: use diluted EPS for realistic valuations, especially where compensation is heavily share-based.
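The difference between basic and diluted EPS can be sketched as follows. This is a deliberately simplified illustration with hypothetical names and figures: it just adds the potential new shares to the share count, whereas the methods prescribed by accounting standards for options and convertibles are more involved.

```python
def basic_eps(net_income: float, shares_outstanding: float) -> float:
    """Profit attributable to each share currently outstanding."""
    return net_income / shares_outstanding


def diluted_eps(net_income: float, shares_outstanding: float,
                potential_shares: float) -> float:
    """Simplified diluted EPS: also counts shares that could still be issued
    (e.g. from options or convertible bonds)."""
    return net_income / (shares_outstanding + potential_shares)


# Hypothetical company: profit 1,000, 100 shares outstanding,
# 25 further shares possible from options and convertibles.
print(f"Basic EPS:   {basic_eps(1000.0, 100.0):.2f}")
print(f"Diluted EPS: {diluted_eps(1000.0, 100.0, 25.0):.2f}")
```

Diluted EPS is never higher than basic EPS in this simplified form, which is why it is the more cautious basis for valuation.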
📘 Free cash flow per share (FCF per share)
📈 What is it?
Free cash flow per share shows how much free cash inflow a company has available per share, after investments but before dividends or debt repayment.
🧮 How is it calculated?
🏛️ Why does it matter?
FCF per share shows how much liquid funds per share actually remain in the company, which matters for dividends, share buybacks or debt repayment. Unlike profit, it is harder to manipulate and therefore particularly informative.
🧮 Calculation
🎯 What does it mean for investors?
- High free cash flow per share is a sign of high financial flexibility.
- It shows how much capital a company can effectively deploy or distribute.
- It is particularly relevant for companies with strong dividends or a strong return on capital.
📘 Short interest
📈 What is it?
Short interest shows how many of a company's shares are currently sold short, i.e. borrowed and sold by investors in the expectation of falling prices.
🧮 How is it calculated?
The value shows the proportion of shares currently being used to bet on falling prices.
🏛️ Why does it matter?
Short interest serves as a sentiment indicator: a high value points to skepticism or negative expectations about the company, but it can also lead to a "short squeeze" if the price suddenly rises.
🧮 Calculation
🎯 What does it mean for investors?
- Low short interest points to confidence in the company.
- A high value can be a warning sign, or an opportunity if sentiment turns.
- It is particularly interesting in volatile markets or ahead of important quarterly results.
📘 Employees
📈 What is it?
The number of employees shows how many people a company employs worldwide and is an indicator of size, structure and business model.
🧮 How is it calculated?
🏛️ Why does it matter?
It helps in assessing economies of scale, efficiency and personnel costs. Together with revenue and profit, it can be used to derive metrics such as productivity per employee.
🧮 Calculation
🎯 What does it mean for investors?
- Many employees mean high operational complexity, but also high revenue potential.
- Productivity per employee is an important indicator of efficiency.
- It is particularly interesting for fast-growing tech or industrial companies.
📘 Revenue per Employee
📈 What is it?
Revenue per employee shows how much revenue a company generates on average per person employed, a measure of efficiency and productivity.
🧮 How is it calculated?
The employee count usually comes from the most recent available annual report.
🏛️ Why does it matter?
This metric helps compare business models, especially between labor-intensive and technology-driven companies. A high value points to automation, efficiency, or a high share of value added.
🧮 Calculation
🎯 What does this mean for investors?
- High revenue per employee suggests a scalable, high-margin business model.
- A low value can point to labor-intensive processes or lower value added.
- Especially helpful when comparing tech and industrial companies.
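As a minimal sketch of the calculation described above, revenue per employee is total revenue divided by the employee count, with the count taken from the most recent annual report as the text states. The function name and the sample figures are hypothetical and are not NVIDIA's actual numbers.

```python
def revenue_per_employee(revenue: float, employees: int) -> float:
    """Return average revenue generated per employee.

    `employees` is assumed to be the headcount from the most
    recent available annual report.
    """
    if employees <= 0:
        raise ValueError("employees must be positive")
    return revenue / employees


# Hypothetical figures: 1,000,000 in revenue and 50 employees.
print(revenue_per_employee(1_000_000.0, 50))
```

Because revenue is often a trailing figure while headcount is a point-in-time value from the annual report, the two inputs may refer to slightly different periods.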
NVIDIA Stock Analysis
Analyst opinions
68 analysts have issued an NVIDIA forecast:
Beta NVIDIA Events
Past events
| Date | Event | When |
| JUN 4 | Bank of America 2026 Global Technology Conference | 20 days ago |
| MAY 28 | TD Cowen's 54th Annual Technology | 27 days ago |
| MAY 20 | Q1 2027 Earnings Call | about a month ago |
| MAR 17 | Shareholder/Analyst Call - NVIDIA Corporation | 3 months ago |
| MAR 16 | NVIDIA GTC AI Conference 2026 | 3 months ago |
| MAR 4 | Morgan Stanley Technology | 4 months ago |
| FEB 25 | Q4 2026 Earnings Call | 4 months ago |
| FEB 3 | Second Annual AI Summit | 5 months ago |
| JAN 12 | 44th Annual J.P. Morgan Healthcare Conference | 5 months ago |
| JAN 5 | Special Call - NVIDIA Corporation | 6 months ago |
| JAN 5 | Special Call - NVIDIA Corporation | 6 months ago |
| JAN 5 | CES 2026 | 6 months ago |
| DEC 2 | UBS Global Technology and AI Conference 2025 | 7 months ago |
| DEC 1 | Special Call - NVIDIA Corporation | 7 months ago |
| NOV 19 | Q3 2026 Earnings Call | 7 months ago |
| SEP 18 | Special Call - NVIDIA Corporation | 9 months ago |
| SEP 8 | Goldman Sachs Communacopia + Technology Conference 2025 | 10 months ago |
| AUG 27 | Q2 2026 Earnings Call | 10 months ago |
| JUN 11 | Shareholder/Analyst Call - NVIDIA Corporation | about a year ago |
| JUN 10 | Rosenblatt’s 5th Annual Technology Summit - The Age of AI 2025 | about a year ago |
| JUN 10 | Nasdaq Investor Conference 2025 | about a year ago |
| JUN 4 | Bank of America Global Technology Conference 2025 | about a year ago |
| MAY 28 | Q1 2026 Earnings Call | about a year ago |
NVIDIA — Bank of America 2026 Global Technology Conference
1. Question and Answer
Good morning. Welcome to day 3 of The Bank of America Global Technology Conference. I'm Vivek Arya, I cover semiconductor, semi cap equipment. And I'm really delighted and honored and it's a real treat to have Colette Kress, Executive Vice President and CFO of NVIDIA, to join us for the keynote session this morning, fresh off a number of announcements from GTC Taipei.
So perhaps, Colette, if you could start with maybe giving us right, some sense of what NVIDIA announced, right? How it kind of fits your strategic direction, and then we can go into a few other questions. But thank you so much for joining us.
Thank you for having me. I'm going to give you 1 quick statement at the very beginning here. As a reminder, this discussion may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to our risks and uncertainties facing our business.
Yes, GTC Taipei was a really, really great event. We do enjoy going to Taiwan, meeting with such an important part of our suppliers and helping our suppliers and that full ecosystem, really understand the progress we're making and speaking to them directly on that. So there were great announcements that we did make.
One of the top announcements is let's not forget Vera Rubin and is on its way and definitely is in full production. As we had discussed part of our earnings we indicated that, yes, it is planned for the second half. But more importantly, we talked about it, it is coming soon. It's ready for Q3.
So Q3, we're looking forward to standing up Vera Rubin. As you can imagine, we're already in full production in order to make that plan within Q3. But that was one of the very first pieces of it. The other piece is some of the excitement that folks have talked about, which is referred to as Vera CPU. Vera CPU is a great opportunity for us to both continue to expand with an important area of agentic solutions. When you think about agentic solutions, that CPU is going to be an essential part, the director of directing that Agentic work together. And we have the ability, again, to do an extreme co-design with what we enabled with a CPU.
That extreme codesign is different than many other types of CPUs that are out there. It is on our own cores. It also has a tremendous improvement in terms of productivity, but also in terms of just its sheer performance is about a 2x versus any of the other x86 CPUs. We've added this to our portfolio, not only within our full systems that we have for Vera Rubin or even what we have with our existing Blackwell, but this is also an opportunity to sell stand-alone, and we do believe that will be a big opportunity for us.
This is also a time to talk about what we can do in terms of the PC. We've been working very feverishly with many different providers to help us on that, but this is putting together RTX Spark into the market. This is a great opportunity for AI types of PCs and very, very key in focusing those that will be doing agentic type of work as well and using that PC with the performance is built together with MediaTek as well as our GPU and that's a great opportunity for us.
Lastly, this was a discussion to talk about really the diversification of both what we are building, but also the different types of customers and users that we see worldwide.
We're in a unique position versus any other type of company. And you'll hear more of that, I think, in our discussion, when we talk about not only just the hyperscalers, but a very important group of AI clouds and what they have built in terms of the market. We have so many different folks that we need to help with, that will be the enterprises, that will be the industries, that will be sovereigns, and we have that complete diversification and everything that we're selling. So those were some of the key highlights that we had.
Excellent. So maybe let me pick up, Colette, on that point because I think one of the very important disclosures you made as of the last earnings call, was giving us this transparency, right, between hyperscalers. And I was struck by 2 things. One is that both businesses, both your sales to kind of the large public hyperscalers and this other business to kind of the neocloud, sovereign, right, on-premise, they're about the same size, but the hyperscaler business is growing faster. Like I would have thought that the other business would be smaller and growing faster, but it's the same size and growing slower. So if you could just walk us through what is the right way to interpret that disclosure?
Okay. So let's step back and talk about our disclosure we've had. We've been providing, of course, the data center numbers that we've had. One of our focus was to provide you an understanding of compute and networking. What was interesting is we're going on our now third generation of an extreme co-designed, full data center scale system. We're taking that and breaking that down to show you compute networking.
But if you think about it, all of those full systems are always going to incorporate both our compute and networking together. And our attach rate of what we are seeing in terms of the networking is an enormous piece of that as well. We're probably more than 90% attach rate in terms of what we're doing on networking. So now let's step back. What did we want to show you with a new way to think through what we're doing.
You get asked all the time in terms of what percent of our revenue was the hyperscalers. And that was, we had indicated for many years, many quarters, indicating that it's approximately 50%, plus or minus any single 1 quarter in terms of what we see. Now we are growing underneath that. So when you're saying it's 50%, that full growth coming from hyperscalers there. They're a very important group in terms of what they have done in terms of standing up clouds. But what's interesting is the other 50% and this is what we brought attention to, which is unique to NVIDIA in terms of our AI focus clouds, okay?
What we're looking at in terms of this group, in terms of the ACIE is what we are calling it, is a focus in terms of these types of AI clouds. These AI clouds are newly standing up, not a general-purpose cloud that has been there, but focusing on either AI factories, but most importantly, focused on our reference architecture because they need to know that full stack in order to put that into space.
So what they've done is a fast-moving capability to serve enterprises, to serve the industries and serve many of the different countries and regions that we see. We have AI clouds, definitely here in terms of the United States, you see a building already in terms of Europe, and now also big areas in terms of Southeast Asia as well. These are important areas to serve this market. Folks are not going to move back into their own enterprises data centers to build this. They need an AI cloud builder. This is where the token generations and pieces will be.
So interestingly, that's the other 50% and such a fast growing, as Jensen actually communicated, likely the fastest-growing piece, both their speed and the need to serve such a big market, you've got that ability now with this group. So we'll continue to be providing you that information broken out in that manner. There's a lot of interesting things happening in that.
Got it. And what I also found interesting is, despite this kind of daily debate, discussion noise about how much ASICs might be impacting your business, your business in hyperscalers, if I have my numbers right, was up like I think 115%, right? So it was definitely at a very strong pace. Are you well represented at all hyperscalers? Is there some concentration at certain ones, where people have custom chips? So how are hyperscalers kind of making this determination between where should they deploy their ASICs, where should they deploy NVIDIA GPUs?
Yes, very good question to talk about the word diversity. Because if you listen to each of the different groups, they're communicating the diversity, but let's step back and say, the king of diversity is probably NVIDIA and everything that we have done, because each and every cloud that is there, whether that be a hyperscaler cloud and/or an AI cloud, we are also being a part of them.
Also, when you think about the AI model makers, when you think about the foundation models, they are also 100% on our platform as well. So we have the diversity of both what the types of models there are. We have the types of clouds that there are. Meeting that together, we commend to be the biggest drawer of the diversity going together.
The cloud operators, they have continued to use us not only for today, tomorrow, but in the future because what they are doing is building a significant amount of understanding of how we have designed at an extreme codesign. Most of them are actually continuing to sell what we have in the cloud for them, whether that be our Hopper architecture, whether that be our Blackwell architecture and our future of Vera Rubin. What that means it's sustainable to stand up, and we continue to watch it grow and grow in this area that they will be able to serve so many of the different markets because of what we have built in a full ability for them to not only do end-to-end types of solutions, but we can be ready for what we did at the very beginning of ChatGPT to what we are seeing now in terms of agentic. And that diversity of all of our different customers is very key. Nothing that any other type of builder really has because we are supported and very helpful for them.
Got it. The next topic I wanted to touch on, Colette, was that there is this perception that as the market is moving from training to inference that NVIDIA dominated the training phase, that inference is going to be a lot more fragmented, although when I look at these kind of growth rates, it tells me you're participating well in it.
So how do you kind of frame the discussion that is it a fair pushback to say that inference, right, like I saw this one headline, which said, "Oh, the center of gravity is moving from the GPU to the CPU," right? So are you starting to notice a different kind of competition as the market moves towards -- like what percentage of your business today is in inference versus training as an example?
Okay. So let's step back on the 2 pieces here, basically thinking about there's a training part of it and there's an inferencing part of it. The inferencing part is extremely important for many of these model makers or any of these new companies starting. Why? Because once you're moving to inferencing, now you've got revenue. As you've seen the growth right now in agentic types of solution, it is -- the only way to describe it is the growth is vertical. The percent is not the most important thing. It just -- it is vertically going up in terms of that. Once you reach a point that you are now at revenue, the most important next thing that you want to think about is how do I make sure I can get a profit. And how can I make sure that I am using the best-of-breed in order to serve that.
Therefore, for those tokens, I want as many tokens as possible in the shortest amount of period of time, and I want the most efficient use of that type of compute. And that's where we come in. Because when you think about our systems, they've all been designed, thinking about these models, thinking about the full software stack that is also necessary for it, but being the most productive as well as the most lowest cost.
So this inferencing is not any easier. It's actually harder if it is just a static ASIC because we have been continuing to design over and over again. We still look at this as helping them focus on what type of use they want to use for the inferencing because it's not that simple that it's one piece as well. As you're working right now that says low latency as we go forward, there is a piece of agentic that says, I'm going to go back to the director of the CPU and ask that CPU to assist in this piece of it. But again, that hard work is still going to be necessary on our full end-to-end systems and with that important part of the GPU as well.
So what percentage is of our inferencing? Almost every one of our systems, both is using for training, but is also easy to move over to also do inferencing. With Grace Blackwell, it was the first time that we had also seen inferencing first for some of the systems that are built. Most of those AI model makers that you see right now and the models that they've put out there, Grace Blackwell is very, very key for much of the inferencing that is happening.
Got it. One other thing that I've noticed is over the last few years, we saw a lot of deployments, and I think it was harder for investors to kind of connect the dots to ROI and monetization. But this year, we have actually seen much better -- Is that a fair representation of what's going on? What are you hearing from customers? Are they noticing real ROI now, right? And what has kind of created this switch?
That's correct. The ROI has absolutely represented a significant amount with the cloud providers. The clouds were already making a great return as they continue to sell on the cloud. Now when you are also supporting those foundational models and what their work is doing in agentic, you are seeing not only enterprises focus, but you have the entrance of consumers as well.
Folks working on agentic types of work even at a personal level is also very key that is enabling the revenue, which is therefore, enabling more and more profit that allows them again to serve more. Most of them right now are in a position that says the compute is tight. There is still a shortage of supply of compute stood up that they want more and more that they are trying to get, and we're working with so many of them to help them as this agentic plan that we knew would be a great part of the next stage is actually almost a greatness that is causing, again, more and more compute needs, and we're going to work very hard to serve many of that.
Got it. We have seen just insatiable demand, right, from these public LLM companies, if I can call them that. Now the skeptic will say that they are trying to get their hands on compute wherever they can find. Does it not mean that hardware to some extent is being commoditized because of that? Because if they don't care whether they're using a 5-year-old GPU or the latest generation GPU, they're just trying to get compute whatever. So does that speak to commoditization? Or does it -- that just says that there is just a...
It's actually the opposite. It's actually given that full strength of agentic, it is forcing them to say, I've got to find the best-of-breed to do that. I agree that they're working on many different solutions to try and put that together, but you really have to look at where would the lion's share of that come from. We tend to continue right now growing our capabilities, growing our ability with more and more supply, and we are serving a vast majority of what is needed right now in this agentic role.
And I think that's going to continue. The perception that, hey, any chip will work, we're not a chip. It does take a full end-to-end data system to do that because you really have to think about the work that needs to be done in agentic starts at the onset of the information landing in that data center all the way through to the end. No simple one chip would be able to solve that. We think about all the 7 different chips that we'll even have with Vera Rubin fully designed for these types of solutions that we can answer all different types of questions all within one full system. And that's why we are continuing to be looked at.
We have certainly engaged with a lot of different companies matchmaking with them, where how can we help them both obtain the land, power, shell, how do we help them in terms of standing up the compute as fast as possible for what they need to do.
Got it. Given you mentioned land, power, shell; those are very, very well, I think, recognized constraints. Do you think they are constraints to a point where your existing customers might say, "You know what, it might make more sense to just upgrade my existing GPU infrastructure." Like has that upgrade been a factor so far? And when does it start becoming a factor?
Yes. It's an interesting debate that says you have data centers that may have already incorporated the Hopper architecture. What is interesting right now is the Hopper architecture in the cloud is actually earning more money than the very first day that they had it because that has been something that has continued to improve, whether it be Hopper, whether it be Blackwell, and you're going to see the same thing with Vera Rubin.
We continue to improve it with software to align with what currently is happening in the market to provide that with inside those systems. So that is actually helpful for them. A, the depreciable life may be a certain amount of time. The useful life is very long. If you think about it, why they want to keep it up and running, the time down to then build another data center versus this is actually quite useful for us is much easier for them right now.
So particularly in this time, the brand-new data centers are also important because you want your best on that. If you spent hours, years trying to find the exact land, power, shell given that it can be a very tight area right now to find, you're going to want the best. You're not going to sit there and say, I'm going to use something and give it a try. They know that NVIDIA's architecture has continued to ramp for this, and that's why many of them are on that plan for all of these different data centers.
Got it. So right now, to your point, the focus is obviously greenfield, it's a lot of the newer products and then existing, it's try and get as much, right? So upgrade is not yet a factor. But do you think there is a point at which that trade-off makes more sense that you can generate so many more tokens, right, and generate so much more revenue that, that offsets whatever depreciation benefits you could get from an older generation product?
At some point, it will get to be there. But it's an interesting time right now given the tightness. Those new data centers that are being built are absolutely built with the most sustainability, but also thinking about the efficiency of power and most of them are liquid cooled. When you're looking back in terms of a Hopper, you'd have to change that data center. It will happen. They will move that. But right now, moving ahead, getting ready for Vera Rubin is what the key thing they're doing.
And I'm glad you mentioned software because I find that sometimes that is sort of underappreciated about the full stack aspect of the platform, right? And the reason one can extract a lot more value even from existing hardware, right, is because of that constant innovation in software. So maybe talk to us about what is the role of software? What is the role of having all these domain optimized libraries and developers because sometimes I find that people just kind of treat it with a throwaway line of, oh, the CUDA moat is broken. And I imagine it's a lot more than just one operating system moat.
If you discuss with any of the foundation model makers or any of those creating these types of solutions, they are all focused, yes, as the underlying CUDA, but those key things is the domains and the libraries of software that we have. When you think through any type of AI cloud that is standing up. One, they're working on, can I design a data center? Can I stand this up? They need a full end-to-end architecture. They don't have thousands and thousands of software engineers that for over the last 20, 25 years, have designed software that is both backwards compatible and forwards compatible as much as they are leveraging us and CUDA.
So our work in terms of those libraries for each of the industries, each of the enterprises is going to be fundamental, particularly to the AI clouds and the work there. They can get a full stack. They don't have to rethink, redesign it. But it's more important to that, even within the cloud and you have the foundational models in the cloud, they too are also very keen on to that stack of software that we are working together with them to continue to enhance. We have our own models with inside NVIDIA. Those models are leveraging our software and building upon our software because we do also understand how models are working.
Absolutely. And none of that is something that ASICs can do?
Absolutely not. It doesn't have the ability. It's a fixed. It was determined at the point that they went and designed it, the time that they went to go tape it out, it's done. It doesn't have the ability to change over that. But we have that because we have a full platform to revise any part of that from here and going in the future.
Got it. I wanted to touch on 2 more topics in the last few minutes. One is on constraints and then the CPU side. So on the constraint side, given the strong growth that you're seeing, what are you doing to -- are there constraints? Like what is the constraint? How are you dealing with it? And can the fact that you have prepared for it, is that a competitive moat by itself also in this industry?
So it's been interesting to watch as we knew the growth, and we have been continuing to talk about the growth going forward. Remember, we think about 2025 through 2027, if we think about Blackwell and Rubin together, that's $1 trillion.
We knew that. We knew that, that opportunity would be in front of us, and we knew it would probably exceed that. What that means is you have to be thinking about your suppliers, your full ecosystem because I would say it's not just the supplier piece of it, you also have to think about the land, power, shell as essentially a supply. It is a principal piece of what we need to do to complete. So our work is long-standing.
Our supply is not just ordering what is there. Remember, often, we are in deep design with many of our different suppliers at the very onset of what we are putting into our full systems. That design product is working with them way ahead of where we are now in market. You would say probably more than 3 years ago, those are where some of those discussions started. And then that is a continuous ongoing focus in terms of the supply we need. We have a significant amount of purchase commitments. It's a number that I don't think any significant company would look at that and say it's a small number. We're essentially at about $124 billion of commitments.
Now the entire logic semiconductor industry was that a few years ago.
It's an enormous number. But keep in mind, there's an ongoing every day. You never know what day you are fully where you need to be from a supply. Demand arises the next day, and we're in that continued discussion with our suppliers. No one's immune to the supply constraints. We all have our continued work trying to help them with it. But given our size and given our long-standing work with them, we can continue to think long term with them.
It's important to understand it's not necessarily how they divvy up supply, the question is how do they stand up the capacity, the manufacturing lines to continue what they're doing. And that's where we often come in, how can we assist, how can we help you understand what is going to be necessary? The first question is, how many manufacturing facilities do I need to support NVIDIA?
So it's an important area. I can make the same type of statement for land, power, shell. How do we help them stand that up? How do we align that with people that are interested and didn't have enough compute, but now need more land, power, shell, we can also be continuing to work that ecosystem as well.
Right. So despite the inflation we see in everything around us, do you think NVIDIA can kind of sustain the kind of margin levels despite all this because of the planning work and prepurchase commitments?
Nothing as we see moving forward changes to where we are today. Even when we think about putting Vera Rubin in market. We have done so much work and is very well appreciated with many of our customers, both on the manufacturing side and the cycle time. How long will it take me to get this product into the first day of revenue from what they're putting in together. And they've actually really seen a true difference. We're on our third generation, and that is moving very fast, but also very efficient, not just focused on the system, but the full manufacturing of what it takes to get that up and running fast. So at this time, I don't see anything different going forward. Does this continue for us? Yes. And we'll find out more later, but that's where we stand.
Understood. Before I ask about the CPU, one other thing I kind of remember from what was said at GTC Taipei was the ability to improve monetization, right, per gigawatt. I think there was a kind of a road map given of from $40 billion towards $60 billion to $80 billion. What is driving that? Because those are pretty big numbers, right, and pretty serious. I mean that's like your content expansion with every new generation.
Every new generation takes more and more focus on every last piece of that system and getting it accelerated, okay? Not focusing on a very single one chip. One of the comments that we've made in terms of Vera Rubin, it's 7 different chips. When you think about the importance of just the switching capability, the connectivity of moving, how fast can we move things across? How can we make sure all different types of asks of the compute are being held and accounted for in terms of that?
So that's where the CPU now becomes an important part in terms of how we put that in there. It also thinks of all of the different switches. It thinks about in terms of the adapters and pieces of that. You've got a BlueField opportunity included in this. And then also what we can do just from a sheer GPU and the overall development of that. So you bet, we're going to continue to see not just because we're focusing on the chip, it's because you focus on that full data center and that capability.
Got it. Makes sense. And then just the final topic in the last 90 seconds, which is on the CPU side. So what drove your decision to invest in the CPU? It's not the biggest part of the content. What has made it more important? And why invest in ARM, right? Why not just -- I mean, you were using x86 in many of the head nodes before, why not just continue to stay with that?
Yes. The design that we've done is very well aligned to agentic type of solutions, that verification of the agentic work that needs to be done. So the hard work sits with that GPU and now that CPU is really right in line for what that's going to be necessary. However, it doesn't mean that you just want to build something off the shelf because you still want that connectivity of the software, how does all that work together. It is more performant. Remember, it's a 2x more performance, which is an important piece. But it is also our own cores in terms of what we put together.
If you don't mind, I'd love to take the last 30 seconds here to remind, though, in terms of where we've really gotten to in a very good place in terms of returning capital to our shareholders.
Yes, of course.
I know it's a really great opportunity. You know that we've had to focus in terms of on our suppliers, our ecosystem to build so much of what we've done. But given now where we stand with a significant amount of cash flow generated from these full systems, the amount that we can return to shareholders 50% or more absolutely is a key focus of ours, not just for today, not just for tomorrow, but for the long term, we'll continue to focus on this. But we also added the dividend. Our dividend has been with us yes, for 13 years.
Since you joined.
That is correct as far as when it started. And it's now time to also serve a very important market that also has the ability to, from a capital return, choose in terms of our dividend as well. A great opportunity for $1 a share per year. And that is, again, an area that we will continue to focus on for growth, not just our share repurchases, but also our dividend. So I just want to make sure that I'm clear.
Why limited to 50%, why not 75%?
We're working on it, and we'll let you know in terms of there. But how about we'll go through this year at $1 each.
Maybe next year when you're here for the keynote, by that time.
When we'll come back next year, we'll talk about it. Sounds good. Thank you.
Thank you so much, Colette. Really appreciate your time.
NVIDIA — Bank of America 2026 Global Technology Conference
NVIDIA's CFO announces Vera Rubin for Q3, introduces the in-house Vera CPU (2x vs. x86), and emphasizes the software moat, supply planning, and capital return.
🎯 Key message
- Core statement: Vera Rubin (a fully integrated AI data center system) is production-ready in Q3; in addition, an in-house Vera CPU is added as a co-design, alongside initiatives for AI PCs (RTX Spark) and greater customer diversification between hyperscalers and specialized AI clouds.
⚡ Strategic highlights
- Vera Rubin: Third-generation system, full-stack approach with ~7 specialized chips to optimize training and inference.
- Vera CPU: In-house CPU cores, ~2x performance vs. x86 alternatives according to management; to be offered both within systems and stand-alone.
- Software & ecosystem: CUDA, domain-optimized libraries, and partner co-design (e.g., MediaTek for AI PCs) as a defense against pure ASIC competition.
🆕 New information
- Product timeline: Vera Rubin ready for Q3; production has already started.
- CPU key facts: Emphasis on in-house cores and 2x performance vs. x86; stand-alone sales also planned.
- Supply planning: ~$124 billion in purchase commitments to secure capacity; focus on land/power/shell as scarce resources.
- Capital return: Dividend of $1 per share per year; target of returning ≥50% of free cash flow to shareholders.
❓ Analyst questions
- Hyperscalers vs. AI clouds: Why the two segments are of similar size while AI clouds grow faster; management sees AI clouds as an important, rapidly scaling customer group.
- ASIC risk & commoditization: Demand for whole systems, software, and flexibility reduces substitution by fixed-function ASICs; NVIDIA sees its differentiation as intact.
- Constraints & supply: Bottlenecks are not only chips but also land/power/shell; NVIDIA addresses this through long-term co-designs and advance commitments.
⚡ Bottom line
- Conclusion: Concrete product launches (Vera Rubin in Q3, Vera CPU), plus massive supply commitments and clear capital-return signals, strengthen NVDA's growth outlook and market power in the full-stack AI server business; short-term risks remain around supply and infrastructure bottlenecks.
NVIDIA — TD Cowen's 54th Annual Technology
1. Question Answer
Thank you, everybody. Welcome to day 2 of our 54th Annual TMT Conference. Really pleased to be joined on stage by Sean O'Loughlin, who heads up our networking coverage; and Gilad Shainer of NVIDIA? How did I do?
Almost, almost. Close enough.
All right. I think my bosses are in the room so I am obligated to ask you for an Extel vote, if you think we've earned it this year. And if the Wi-Fi password wasn't subtle enough, we'd really appreciate it. Gilad, maybe just to start with that out of the way, you guys reported earnings last week. The networking numbers you gave, I think, were at $14.9 billion, up 199% year-over-year. A lot of that is obviously captive in your NVL racks, but maybe you could walk through what are the key components that are driving all the momentum you're seeing on the networking side?
Yes. So just to tell a secret, I got to peek at the questions beforehand. And the original question was 199% growth and nearly $15 billion of revenue. And I couldn't sleep at night yesterday because I tried to figure out who wrote the question. 199% and nearly $15 billion, right? It couldn't be an engineer, because an engineer would say 199% and $14.8 billion. And it could be a marketing person, because it was nearly $15 billion, right? So I couldn't sleep at night, sorry for that, trying to figure it out. But he corrected it now. So when you look at what we built and what we designed, we designed a single unit of computing. We designed an AI factory, which is a single unit of computing.
And when you design a full data center, full AI factory that needs to behave like a single unit of computing, there is a lot of infrastructures, a lot of networking infrastructures that you need to bring into that AI factory to make it work like one. There is scale up with NVLink. There is scale out. In scale out, we have InfiniBand as one option, and we have Spectrum-X switch as another option. We have scale across that we're using with Spectrum FGS and then we have introduced a new storage infrastructure with BlueField as a storage processor. And we also have an access network that we're using BlueField as a device to have -- enable access into the AI factory and provide all the security capabilities and so forth.
All of those networks, all of those areas, infrastructures are growing. So we see growth in NVLink as a scale-up domain, we see growth on InfiniBand and Spectrum-X Ethernet as a scale-out domain. And we see growth in BlueField as a storage processor, as also a DPU to enable access. So there's growth on all those infrastructures, all those elements and that contribute to the numbers that you mentioned.
Okay. I'm going to go back in time all the way to 2020. NVIDIA made the acquisition of Mellanox that brought you and your team over. We've referred to this on our team as perhaps the most important and successful technology M&A that has ever happened. Can you talk about how that deal came together? What did NVIDIA see and why did they feel they needed to bolster that networking asset so early? And how is it paying dividends now? And what are your expectations going forward as well?
Yes. There's another thing that I saw in the questions, by the way. Those are very long questions, very long questions. I'm an engineer. So if you have more than 4 words in a question, I completely lose -- so I need to recap what you asked. How the acquisition happened. I think it was simple. Jensen came, we talked and he put a deal and we signed it. That's it. Simple as that. I think that Jensen saw that the world needs computing data centers or accelerated data centers, AI factories. He saw that NVIDIA needs to become a computing company, not a device company, not an ASIC company, but a computing company.
And the way that you connect compute ASICs will determine what those compute ASICs can do. If you connect them in one way, you just get a server farm. If you connect them in a different way, you actually can build a supercomputer. So in order to go in a direction that enables the company to become a computing company, you need to bring the right networking infrastructure that enables all of that magic. And I think this is what he saw in Mellanox. And that's the reason that he came, we talked, there was love at first sight, he put in a bid and we agreed, and we joined NVIDIA. Joining NVIDIA, Mellanox was kind of one team. There was no -- there are no different business units in a sense, Mellanox was one team. And we were focusing on building networking infrastructure for distributed computing workloads.
We built great technology that is used in high-performance computing, and AI is another example of a distributed computing workload, and that's why Mellanox was a great fit for NVIDIA. When we joined NVIDIA, when I joined NVIDIA, it was a great experience because NVIDIA actually behaves and works the same as Mellanox. It's 1 unit. It's actually 1 unit. There are group discussions, group meetings; networking and compute and infrastructure all work as one team. And same as Mellanox. So it actually felt like home. A larger home. There's more people in that house, more rooms in that space, but it felt like we didn't leave Mellanox. It was a great experience and it still is.
Okay. For your direct feedback, I'm going to ask 2 questions at once.
That's going to be -- I'm not going to remember the second question.
We'll get through it together. And then I'll pass it to Sean to ask about scaling up, out, across and diagonally. So I think there's been -- you guys have shifted from selling GPUs to selling fully integrated racks. And I think there has been some pushback from ecosystem partners that don't like being captive and not having optionality over which components to pick and choose. Can you talk about the pros and cons of that go-to-market? And then also, how did NVLink Fusion come about? Was that a reaction to this trend, and what does it offer your customers?
Yes. Well, you did combine 2 questions. So when you build a supercomputer, when you build an AI factory, you need to build it as one unit because that's actually the compute unit. And when you build 1 compute unit that has a lot of components inside, you need to have extreme co-design that combines the software and the hardware and the compute ASICs and the networking ASICs and storage elements and so forth, because you build 1 unit. So we design it vertically. Everything needs to work as a balanced system. If one element does not give what the rest of the elements require, then that system will not work, okay? When we deal with distributed computing workloads, I'll give you one example. When you deal with distributed computing workloads, you need all the compute ASICs to work like one. Let's say I have hundreds of thousands of GPUs in my factory, in my data center. If one of those GPU ASICs gets data a little bit late versus all the others, all the others will wait, okay? That's how serious it is.
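The "all others will wait" point can be illustrated with a minimal sketch: in a synchronized step, the completion time is the maximum of the per-GPU data arrival times, so a single late GPU sets the pace for everyone. The function names, GPU count, and microsecond values below are illustrative assumptions, not figures from the conversation.

```python
import random

def step_time(arrival_delays_us):
    """A synchronized step finishes only when the slowest GPU has its data."""
    return max(arrival_delays_us)

def simulate(num_gpus, base_us=10.0, jitter_us=0.0, seed=0):
    """Draw one arrival delay per GPU: a fixed base plus random jitter (hypothetical values)."""
    rng = random.Random(seed)
    delays = [base_us + rng.uniform(0.0, jitter_us) for _ in range(num_gpus)]
    return step_time(delays)

# No jitter: every GPU gets its data after the same delay.
no_jitter = simulate(num_gpus=1000, jitter_us=0.0)

# With jitter: the step time is set by the unluckiest GPU, not the average one.
with_jitter = simulate(num_gpus=1000, jitter_us=5.0)

# One late GPU out of 1,000 is enough to hold up the whole step.
one_late = step_time([10.0] * 999 + [25.0])

print(no_jitter, with_jitter, one_late)
```

Because the step time is a maximum rather than an average, adding more GPUs makes it more likely that at least one of them is late, which is why jitter matters more as the system grows.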
And therefore, you need to design it vertically. But after we design it vertically and we bring all the co-design elements and making sure that everything works as a single unit, we actually sell it horizontally. You can take pieces. You can take pieces of it. You can take the GPU, you can take the CPU, you can take the networking, you can take NVLink separately, and then you can mix and match with your own designs, if you want to. So what we do, it's actually vertically, but everything can be used as a different separate unit. And nothing is kind of closed. Everything is very open. All the interfaces are given, are known, you can actually put your own software and own modifications and your own enhancement on top of what we do.
And therefore, you can choose what you want to take. NVLink Fusion, you mentioned NVLink Fusion, and that's actually an answer that it's not a black box, because everything we design, we are so proud of it that we're happy if you take any piece of it. So NVLink Fusion, because it's -- I think it's the only scale-up network that is proven from a performance and from a production perspective. And if we build something that great, why wouldn't we want our customers and partners to enjoy that as well, even if they have their own CPU or even if they have their own GPU that they have built and they want to use it? And therefore, nothing is a black box. All the components are available. You can choose, you can mix and match. And Fusion actually enables our customers to also take NVLink as a separate element if they want to do that. And we're also working with an ecosystem. So we have already made announcements about our partners and customers that are part of the NVLink Fusion ecosystem or are using NVLink Fusion for their own AI factories.
I wanted to pivot a little bit to some more geeky and more fun questions about tech rather than these lame business questions. Maybe if I could just ask an open-ended question about Spectrum and its approach to Ethernet in a system way where there's both intelligence on the NIC side and within the switch as opposed to a more purely switch-centric architecture. What are the benefits on the Spectrum side? And how does that translate both in a training environment and in a more distributed inference environment?
Yes. Well, we can take an hour to answer this question. So if you have time, when we start working on Spectrum-X Ethernet. Well, the reason that we start with Spectrum-X Ethernet, first, we have InfiniBand, and we still have, and it's growing, and it's -- it's one of the best technologies ever created for distributed computing workloads. That's why Mellanox did so great in high-performance computing. And if you look on high-performance computing, supercomputers, you're going to see a lot of InfiniBand there. It was built for low latency. It was built to eliminate jitter, which is a key element and so forth. But as AI is growing and AI, every data center become accelerated, every data center becomes an AI Factory. We needed -- we also need to bring an option for Ethernet because we have customers that invested in Ethernet.
They know how to run Ethernet. They built their software management on top of Ethernet, and it's going to be hard for them to go and do something else. So we have InfiniBand for people to use InfiniBand, and we also wanted to design an Ethernet version that can also be used for scale-out, that can also be used for AI workloads and distributed computing workloads. Now when people refer to Ethernet, it's important to note that there is no one Ethernet out there. There are different kinds of Ethernet that were developed for different kinds of workloads.
There is a kind of Ethernet that was developed for highly virtualized, small-radix infrastructure. There is another kind of Ethernet that was developed for single-server workloads, large cloud infrastructures. There's another kind of Ethernet that was developed for telco and DCI and kind of long distances, based on the deep-buffers approach and so forth.
The issue that we had is that none of those were built for distributed computing workloads. None of those were designed to eliminate jitter. Jitter was fine. If I build Ethernet for single server workloads, I don't care if there is a skew in time between 1 server to another server because there is no communications between them. If I'm building something for a long distance or DCI and I base it on deep buffers, I actually based it on creating jitter, okay? So none of them were dealing with jitter and jitter is the biggest problem when you deal with distributed computing workloads or AI training and inferencing, which is our example for distributed computing workloads.
And that's the reason that we actually created Spectrum-X. And Spectrum-X is the only Ethernet that is purposely built for AI. Now something that we learn from InfiniBand is that there is no way to build a network that's going to eliminate jitter and do that on a single device. No way. And it's simple to explain it, okay? Data that comes out from the GPU goes out in an order same as we speak. There is an order of the words. Data that's going to be written to a remote GPU needs to get to that remote GPU memory in order. And if that data is going to go through a switch and that switch needs to maintain that order, then that switch will introduce jitter.
And the reason is that every switch has a lot of ports that you can use. There are a lot of paths in the network. If the switch starts distributing traffic so that every packet can go on a different route, because there is a less busy road that I want to use, then that will create, by definition, out-of-orderness in the delivery of data. And that means that I cannot use it on the other side. And if you look at all the designs of the off-the-shelf switches that exist today, they are actually based on not creating out-of-orderness, so they're using approaches like flowlets, which means if there is a flow, I'm going to keep that flow even though there is an empty road on which I could get it there faster.
No, I'm going to keep it on the same path because the data must get to the other side in order. That's your enemy, okay? That's how you create jitter. And we didn't want that to happen. We actually wanted to make sure that there is no jitter. And in Spectrum-X Ethernet, the switch needs to unconditionally distribute traffic across the entire infrastructure that exists. The switch will choose a different port for every packet. What is the fastest path, what is the least busy path I'm going to use. And by definition, I'm creating out-of-order data delivery. And in order to put the data back in order, I need a SuperNIC on the other side.
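The contrast between keeping a flow on one path and choosing a path per packet can be sketched as a toy load count. This is a simplified illustration under stated assumptions, not how Spectrum-X or any real switch is implemented: the modulo "hash", the function names, and the packet counts are hypothetical, and real flowlet switching can move a flow to another path during idle gaps, which this sketch ignores.

```python
from collections import Counter

def flow_pinned(packets, num_paths):
    """Every packet of a flow follows the single path chosen from the flow ID."""
    load = Counter()
    for flow_id, _seq in packets:
        load[flow_id % num_paths] += 1  # toy stand-in for a flow hash
    return [load[p] for p in range(num_paths)]

def per_packet_spray(packets, num_paths):
    """Each packet independently takes the currently least-loaded path."""
    load = [0] * num_paths
    for _flow_id, _seq in packets:
        path = min(range(num_paths), key=lambda p: load[p])
        load[path] += 1
    return load

# Two large flows (IDs 0 and 4) that happen to map onto the same path out of four.
packets = [(0, seq) for seq in range(8)] + [(4, seq) for seq in range(8)]

pinned = flow_pinned(packets, num_paths=4)
sprayed = per_packet_spray(packets, num_paths=4)

print("flow-pinned load per path:", pinned)
print("per-packet load per path: ", sprayed)
```

In this toy case the pinned flows pile up on one path while three paths sit idle, whereas per-packet selection spreads the load evenly. The price, as described above, is that packets of one flow now travel on different paths and can arrive out of order.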
So I'm using RDMA because RDMA enables me to put the data directly in the GPU memory, no buffer copies, no delay on the other side. But I need a smart element that sits next to the GPU on the server that will take data that's going to come completely out of order, but place it in the right order in the GPU memory. And that's the purpose of the SuperNIC. And that's why when you build an infrastructure for distributed computing workloads, you need to have a switch element that does the distribution unconditionally. And then you need a SuperNIC that will put the data back in order. That's why it's an infrastructure, and it's not a single device.
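The receiving side of that design can be sketched in a few lines: if every packet carries the offset it belongs at, the receiver can write each payload straight into its final position, so arrival order stops mattering and no reordering buffer is needed. This is a conceptual sketch only; the packet layout, the shuffle standing in for unequal path delays, and the function name are assumptions for illustration, not the SuperNIC's actual mechanism.

```python
import random

def spray_and_place(message: bytes, chunk_size: int, seed: int = 0):
    """Split a message into (offset, payload) packets, deliver them in a
    shuffled order, and place each payload directly at its offset."""
    packets = [(off, message[off:off + chunk_size])
               for off in range(0, len(message), chunk_size)]

    # Stand-in for per-packet path selection: packets arrive in arbitrary order.
    arrival = packets[:]
    random.Random(seed).shuffle(arrival)

    # Direct placement: each payload goes to its final offset on arrival,
    # with no intermediate copy and no waiting for earlier packets.
    dest = bytearray(len(message))
    for off, payload in arrival:
        dest[off:off + len(payload)] = payload

    return arrival, bytes(dest)

msg = b"data must land in GPU memory in order"
arrival, placed = spray_and_place(msg, chunk_size=4)

print("arrival order of offsets:", [off for off, _ in arrival])
print("reassembled correctly:   ", placed == msg)
```

The destination buffer ends up identical to the original message whatever the arrival order, which is the property the switch relies on when it distributes packets unconditionally.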
I think that's a perfect segue to kind of expand this conversation about out-of-order and packet-spraying type concepts. Maybe you could just talk about MRC and the recent announcement that you made with your consortium partners, as well as maybe contrast that with some of the goals that the Ultra Ethernet Consortium is going -- because it sounds to a layman like myself, and I would assume most in the room, like a lot of what the Ultra Ethernet Consortium is attempting to do is solve for that problem.
Yes. There is more and more focus on AI workloads as every data center is going to be accelerated and AI is going everywhere. So obviously, there is a lot of attention on it. What we did in Spectrum-X Ethernet is 2 things. One of them is we brought a lot of learning from InfiniBand to Ethernet. Lossless. The reason that we want -- we prefer lossless is because we don't want to drop packets because of congestion, because once you drop a packet, you need to retransmit it, and that means jitter, that means extra delay. So we don't want to drop packets and we're focusing on lossless. Focusing on RDMA. And by the way, the other protocols that you mentioned are also based on RDMA. RoCE is just RDMA over Ethernet. If you say RDMA and RoCE, actually, you said the same thing, okay, twice.
ATM machine.
Yes. MRC is also RDMA- or RoCE-based, for example, and so forth. We also brought adaptive routing into the infrastructure, done in hardware, because we actually want the decisions on the different paths to be made very, very quickly, immediately. So we brought all those things into Spectrum-X. We also enabled in Spectrum-X the flexibility to support other routing protocols on top of that. MRC is an example of that. MRC is another way, or another algorithm, to distribute the traffic across the network. And Spectrum-X does not support just one protocol. Spectrum-X actually supports multiple protocols on that infrastructure. It supports the adaptive RDMA protocol. It supports the MRC protocol on top of that. And I can tell you, it supports other customized protocols that our other customers or large customers have developed and are using.
So there is a variety of routing protocols that can run on top of Spectrum-X, and they are optimized as an entire infrastructure on the end-to-end side because, again, for any protocol, you need 2 elements at least. There is an element on the SuperNIC and there is an element on the switch. A lot of the things, by the way, that we built into Spectrum-X are the things that were discussed later on in other consortia like the one you mentioned, and there are also other groups of companies that are working on more algorithms and so forth.
As we have customers that build very large infrastructures, very large AI factories, those are expensive AI factories and they would like to optimize their infrastructure to the way they run their own workloads. So that's why we brought the ability in Spectrum-X to support different kind of routing protocols to do that in a zero-jitter approach, and it could be the adaptive RDMA, MRC and several others.
Just briefly clarify when we talk. So would it be fair to compare MRC to, for example, BGP as that is another routing protocol that could be built on top of Spectrum.
Yes, there is -- exactly. And in Spectrum-X, first, what was important for us was to use all the standard protocols that exist in Ethernet. The way that we implement them was done differently in order to eliminate jitter and so forth. MRC is another way to route packets, for example. You mentioned other protocols; yes, there are multiple protocols that you can use. There are ways to implement them that eliminate jitter, that have zero jitter, which is the key element. And all of those options are supported on Spectrum-X Ethernet.
I'll maybe steal one more geeky question. And that is, if you could just talk about how the networking problem changes moving from large-scale pretraining to maybe a multi-tenant inference workload type? Does the -- and maybe how are your customers thinking about provisioning fungibility across those 2 deployments? Is there may be an overprovision of a back-end network in the eventual inference because it gives you flexibility to scale up and down, not scale up in the networking sense, but scale up and down...
Yes. So there is actually a lot of commonality between pretraining and inferencing. Both are distributed computing workloads. Both require zero jitter. Now when we say zero jitter, zero jitter means that if you're running a single workload, that workload will not impose different delays on different communications to different GPUs, because that's going to be a nightmare, okay, from a performance perspective. But it's the same thing when you run multiple workloads on the same infrastructure, like a cloud, like an AI cloud. One of the key problems in traditional clouds, where off-the-shelf Ethernet was used, is that jitter was not a focus. What happened is that one workload could impose performance issues on another workload. One workload can create delays in the network that will impact another workload that shares the same infrastructure.
And one of the common best practices in traditional cloud was never have 2 different users run on the same switch because one will impact the other. And then your SLA is gone out of the window. So there was a heavy focus on how do I schedule different jobs in a traditional cloud that one job will not be in the same switch as another because that will negatively impact the performance on another job. Once you deal with jitter, once you eliminate jitter, it means that there is no traffic that will create congestion in the infrastructure.
And if there is no traffic that will create congestion in the infrastructure, there is no way for one workload to impact another workload. So it doesn't really matter if those are 2 training workloads running on the same infrastructure or 100 inferencing workloads that run on the same infrastructure. You need the same solution for both. So what we brought in Spectrum-X for training, that was the first workload that we were running, and it's so amazing now: when you have inferencing, you can actually see the difference in that.
Now inferencing does enable or -- the need to create more infrastructures. And recently, we announced a new storage infrastructure for context, for memory context for inferencing, because now as we move to the world of agentic AI, there's AI talking with AI, there is much more data that you need to hold. There's more -- there are larger sizes of KV cache. Not everything can be stored in a local server, in a GPU server, and you need to go to outside storage. The outside storage that exists is network storage, and network storage is great for a variety of workloads, but it's not really optimized for inferencing, because network storage was built to make sure that the data is not going to get lost.
So I'm going to invest in replicas of the data and making sure that if an SSD went down, I still have replicas on others. That's too expensive if you look at inferencing, because in inferencing, for the rare cases that something is going to happen to an SSD, I can actually recalculate the data. So instead of investing in replicas and so forth, I can build something that is going to be much more effective and optimized for inferencing.
And that's what we did with BlueField and STX and CMX and creating a new storage infrastructure for inferencing for KV cache. So what we built for training works greatly for inferencing actually, but inferencing created or drove the creation of more infrastructures as part of the big AI factory.
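The "recalculate instead of replicate" idea can be shown with a minimal single-copy cache: entries are stored once, and a lost entry is regenerated on the next read rather than restored from a replica. This is a conceptual sketch, not the BlueField, STX, or CMX design; the class, the `lose` method simulating an SSD failure, and the `fake_kv` stand-in for the recomputation are all hypothetical.

```python
class RecomputableCache:
    """Single-copy store: a lost entry is recomputed, not restored from a replica."""

    def __init__(self, recompute):
        self._store = {}
        self._recompute = recompute  # function that regenerates a value from its key
        self.recomputes = 0

    def put(self, key, value):
        self._store[key] = value

    def lose(self, key):
        """Simulate a device failure that drops one entry."""
        self._store.pop(key, None)

    def get(self, key):
        if key not in self._store:
            # Rare path: pay compute once instead of paying for replicas always.
            self.recomputes += 1
            self._store[key] = self._recompute(key)
        return self._store[key]

def fake_kv(prompt):
    """Hypothetical stand-in for recomputing a KV-cache entry from its prompt."""
    return [ord(c) for c in prompt]

cache = RecomputableCache(fake_kv)
cache.put("hello", fake_kv("hello"))

before = cache.get("hello")   # served from the single stored copy
cache.lose("hello")           # the copy is lost
after = cache.get("hello")    # regenerated on demand

print(before == after, cache.recomputes)
```

This only works because the cached data is derivable from its inputs, which is the property of KV cache that the answer above relies on; data that cannot be regenerated still needs durable storage.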
All right. I'm going to ask one that Sean is going to have to deal with the answer to. So it seems like the debate on CPO has shifted from scale out to scale up more recently. What's your view of what CPO can bring to both of these domains and what sort of a reasonable time frame at which we should expect CPO adoption more broadly in your compute ecosystem.
Yes. And I'll combine 2 answers if it's okay. You combine 2 questions, I'll combine 2 answers. There is also -- I heard that there is a debate between copper versus CPO, copper versus optics. It's actually a funny debate. It's like you're going to ask, how do you look at an airplane versus a car? If I need to drive to the next city, if I need to drive to New Jersey, I'm going to take a car. If I need to fly to Taiwan, which I have a flight tonight, I need to take an airplane, right? There is no way I can use a car. The same thing goes for copper versus optics, okay? If I can use copper, which means that the distance that I need to cover is applicable for copper, I'm going to use copper because optics will be too expensive for that.
If I need to go to New Jersey, I'm not going to take an airplane. I can fly from New York to JFK, right? For example, I can do that with an airplane. Yes. But why, right? So copper consumes 0 power. It's very cost-effective, it's very reliable. The problem is short distance. But if that distance is okay for what I am designing, I'm going to use copper. If the distance is not applicable and copper cannot cover the needed distance, I'm going to use optics. That's simple. Now in the optical world, in optical connectivity, there are different ways to connect optics. There are different kinds of transceivers and so forth. But optics, in order to cover distances, requires active devices. It requires different kinds of light sources and DSPs and optical engines and so forth, and all of those consume energy. And we live in a world today where power is the #1 limit of AI factories, of the compute capacity they can build in an AI factory, right? That's my limiting factor.
And of course, I want to try and optimize power consumption. I want to reduce power consumption wherever I can in order to be able to bring more compute, because this is how I'm limited. And since optical connectivity is more and more used, scale-out requires optical connections because of distance, and it consumes more and more power. In the scale-up domain, if that scale-up domain is within the rack, I'm going to use a car. I'm going to use copper. If that scale-up domain starts to span multiple racks, I need to use optics. So when we talked about, for example, connecting 1,152 GPUs with the Feynman platform, we also mentioned, hey, that will also use co-packaged optics, so optics in order to run the distance. Now if I'm using optics, optics on scale-out infrastructure today can get close to almost 10% of the compute capacity from a power perspective. That's a big number.
Co-packaged optics is a technology that minimizes the power consumption that is going to be used on the optical network. And that's why we went to co-packaged optics. That's why we're investing in co-packaged optics, because if I need to go distances, I need optics, and if I'm using optics, I want to have the latest, the best technology that consumes the least amount of power, and that's called co-packaged optics, regardless of whether it's scale-out, scale-up or scale-across. It depends on the distance.
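The decision rule in this answer, copper when the distance allows it and optics otherwise, plus the effect of lower-power optics on a fixed power budget, can be written as a small sketch. Only the roughly 10% figure comes from the answer, and reading it as a share of a fixed facility power budget is an interpretation. The copper reach, the distances, and the 50% power reduction are hypothetical placeholders, not NVIDIA figures.

```python
def choose_link(distance_m: float, copper_reach_m: float) -> str:
    """Use copper whenever the distance is within its reach, optics otherwise."""
    return "copper" if distance_m <= copper_reach_m else "optics"

def compute_power_share(optics_share: float, optics_power_reduction: float) -> float:
    """Fraction of a fixed power budget left for compute after optical links.

    optics_share: fraction of the budget spent on optics today.
    optics_power_reduction: fractional cut in optical power (0.0 means no change).
    """
    return 1.0 - optics_share * (1.0 - optics_power_reduction)

# Hypothetical reach and distances, purely to exercise the rule.
in_rack = choose_link(distance_m=1.0, copper_reach_m=2.0)
across_racks = choose_link(distance_m=30.0, copper_reach_m=2.0)

# ~10% of power on optics (figure from the answer above), before and after
# a hypothetical 50% reduction in optical power.
today = compute_power_share(optics_share=0.10, optics_power_reduction=0.0)
with_lower_power_optics = compute_power_share(optics_share=0.10, optics_power_reduction=0.5)

print(in_rack, across_racks)
print(round(today, 3), round(with_lower_power_optics, 3))
```

Under these assumed numbers, halving optical power would move about five points of the power budget back to compute, which is the kind of trade the answer describes when power is the limiting factor.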
All right. Well, unfortunately, we're out of time. I think we could have sat up here for another hour, but Gilad, we really appreciate you joining us and providing all of your insight. I mean, it's a privilege to get a front row seat to see the innovation you and your team are driving, and good luck.
Thank you very much.
Thank you, Gilad.
NVIDIA — TD Cowen's 54th Annual Technology
NVIDIA positions networking as a core component of its "AI factory": Spectrum-X, NVLink, BlueField, and SuperNIC together form the infrastructure strategy for scalable training and inference.
🎯 Key message
NVIDIA argues that modern data centers must be designed as "one unit": tight co-design of GPU, network, storage, and SmartNICs (SuperNIC) delivers low latency and zero jitter. Ethernet (Spectrum-X) is made AI-ready, Mellanox technology is fully integrated, and open interfaces are meant to let partners choose components.
⚡ Strategic highlights
- NVLink & AI factory: Vertical co-design for scalable training, with NVLink for scale-up and InfiniBand and Spectrum-X for scale-out.
- Spectrum-X: Ethernet built specifically for AI with lossless/RDMA operation, adaptive routing, and support for multiple protocols (e.g., MRC) in combination with a SuperNIC.
- BlueField & storage: BlueField as a DPU/storage accelerator for KV cache and inference-optimized network storage instead of expensive replication.
🔭 New information
More specific than on the earnings call: Spectrum-X explicitly supports multiple routing algorithms (including MRC) and relies on a switch-plus-SuperNIC architecture (packet spraying plus reordering). NVLink Fusion is promoted as an open ecosystem; BlueField-based solutions for inference-optimized storage were highlighted.
❓ Analyst questions
- Mellanox deal: Integration was seamless; Mellanox as "one team" turned NVIDIA into a computing platform.
- Rack sales: Discussion of integrated NVL racks vs. partner options; management emphasizes openness and modular use of individual building blocks.
- Optics vs. copper/CPO: Co-packaged optics (CPO) is described as necessary for reach and energy efficiency; copper remains sensible for short distances.
⚡ Bottom line
Networking is becoming a strategic moat for NVIDIA: technical differentiators (Spectrum-X, SuperNIC, BlueField) raise barriers to entry and broaden the demand base beyond GPUs alone. Investors should watch the revenue mix (networking vs. GPU), partner acceptance of the integrated racks, and energy-efficiency trends (CPO).
NVIDIA — Q1 2027 Earnings Call
1. Management Discussion
Good afternoon. My name is Sarah, and I will be your conference operator today. At this time, I would like to welcome everyone to NVIDIA's First Quarter Earnings Call. [Operator Instructions]
Thank you. Toshiya Hari, you may begin your conference.
Thank you, and good afternoon, everyone. Welcome to NVIDIA's conference call for the First Quarter of Fiscal 2027. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer; and Colette Kress, Executive Vice President and Chief Financial Officer. Our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the second quarter of fiscal 2027.
The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without our prior written consent. During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent Forms 10-K and 10-Q and the reports that we may file on Form 8-K with the Securities and Exchange Commission. All our statements are made as of today, May 20, 2026, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements.
During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website.
With that, let me turn the call over to Colette.
Thank you, Toshiya. We delivered an exceptional quarter with revenue, operating income and free cash flow exceeding our prior records. Total revenue of $82 billion was up 85% year-over-year and 20% sequentially. This marked our third consecutive quarter of year-over-year acceleration and the 14th straight quarter of sequential growth, a significant feat given the sheer size and complexity of our manufacturing operations.
The $13.5 billion sequential revenue increase was also a record. We capitalized on the inflection in inference demand by ramping Blackwell systems across our diverse end customer base, from hyperscalers to model makers to AI cloud providers and sovereign customers. In Q1, we also allocated capital effectively across R&D, investments in our ecosystem and share repurchases. We returned a record $20 billion to our shareholders while executing strategic investments in both the upstream supply chain and the downstream go-to-market ecosystem. This is critical to the market's development and our long-term position.
Data center revenue of $75 billion was up 92% year-over-year and 21% sequentially, driven by sustained strength in our Blackwell architecture. Demand for GB300 NVL72 was particularly strong, with frontier model builders and hyperscalers each having cumulatively deployed hundreds of thousands of Blackwell GPUs, marking the fastest product ramp in our company's history. Grace Blackwell is the fastest training system as well as the lowest token generation cost at inference. Spectrum-X, our end-to-end Ethernet platform purpose-built for AI, is now larger than all Ethernet network peers combined. InfiniBand also had a very strong quarter, growing more than 4x year-over-year, driven by deployments of our next-generation XDR technology.
For your models, data center computing revenue of $60 billion was up 77% year-over-year, while data center networking revenue of $15 billion nearly tripled year-over-year.
Before we dive into data center, we'd like to brief you on our transition to a new reporting framework that better reflects our current and future growth drivers. We have 2 market platforms, data center and edge computing. Within data center, we will report 2 submarkets, hyperscale and ACIE, which incorporates AI clouds, industrial and enterprise. Hyperscale will include revenue from the public cloud and the world's largest consumer Internet companies, while ACIE addresses our growth opportunities in diverse AI purpose-built data centers and AI factories across industries and countries. Edge computing highlights devices for agentic and physical AI, including PCs, gaming consoles, workstations, AI RAN base stations, robotics, and automotive. For your reference, we have posted on our website a revenue breakdown based on our new platforms for the past 9 quarters.
Moving back to our data center results. Hyperscale revenue of $38 billion was approximately 50% of data center revenue and increased 12% quarter-over-quarter. ACIE revenue was $37 billion and grew 31% quarter-over-quarter, including AI cloud revenue that more than tripled year-over-year. Our customers have enabled rapid stand-up of AI compute capacity. The number of partner [indiscernible] exceeding 10 megawatts has nearly doubled in just 1 year, now surpassing 80 sites. Sovereign revenue increased more than 80% year-over-year. NVIDIA AI infrastructure is now deployed across nearly 40 countries, representing $50 trillion in GDP. As evident in our Q1 results, our customer base is diverse and growing, supported by our best ecosystem and installed base, breadth of CUDA-accelerated applications and position as the lowest token cost provider. We are well positioned to address a market opportunity that far exceeds that of any other AI computing platform.
Demand for AI infrastructure continues to expand at an unprecedented pace. The build-out of AI factories is accelerating, and the value of NVIDIA AI infrastructure is rising. The price of renting an H100 has risen 20% year-to-date, while A100 cloud pricing is up nearly 15%. Benefiting from the versatility of our platform and continuous performance enhancements enabled by our software stack, customers are generating profitable revenue beyond the depreciable life of their GPUs. The vast and trusted marketplace for NVIDIA compute is a critical foundation on which billions in AI infrastructure spending is being financed by the ecosystem.
There are 2 primary drivers behind the accelerating build-out of AI infrastructure. First, from search and advertising to recommender systems and content understanding, the largest hyperscale workloads continue to transition from CPU to GPU-based accelerated computing. Second, the adoption of products and services native to AI is inflecting. Since the advent of ChatGPT, we have witnessed mainstream AI transition from one-shot inference to reasoning and now to agentic AI. AI is no longer a nice to have. AI is now a necessity for enhancing productivity across all industries and roles. This is propelling revenue acceleration across all layers of the AI cake, including energy, chips, infrastructure, models and applications. Growth in the model layer, particularly at Anthropic and OpenAI, has been incredible with momentum continuing to accelerate, including breakout growth in OpenAI's Codex since the launch of GPT 5.5.
With analysts now forecasting hyperscale CapEx to exceed $1 trillion in 2027 and agentic AI beginning to proliferate across all industries, AI infrastructure spending is on track to reach $3 trillion to $4 trillion annually by the end of this decade. Our Blackwell architecture is everywhere, adopted and deployed by every major hyperscaler, every cloud provider and every major model maker. Last month, we celebrated OpenAI's launch of GPT 5.5, codesigned for, trained with and served on Blackwell, currently positioned at the top of the Artificial Analysis leaderboards. Microsoft Fairwater, the world's most powerful AI data center, is now live ahead of schedule, powered by hundreds of thousands of Blackwell GPUs.
Starting this year, AWS will add more than 1 million Blackwell and Rubin GPUs, and we are collaborating on Spectrum networking. At Google, Blackwell will be offered to customers in the cloud, including confidential computing capability, a new foundation for secure high-performance AI. Our share of frontier AI compute is increasing. We have deepened our collaboration with Anthropic and are delighted to be a strategic partner to expand their compute capacity. We will support the company's growth trajectory through AWS, Azure, [ Core ], Space-X AI and more.
Now with the addition of Anthropic to OpenAI, Gemini, [ SpaceXXAI ], Meta, MSL, Microsoft AI, TML, Reflection, Complexity, Cursor, and other major frontier labs already building on NVIDIA, our share of frontier AI models will grow significantly. Today's data centers are revenue-generating AI factories. Constrained by power and capital, AI factory operators must choose the right architecture. With our extreme codesign approach, we deliver the industry's lowest token cost, the highest token throughput and the highest ROI. MLPerf inference results are in. And once again, we swept every benchmark as Blackwell Ultra delivered the highest throughput across the broad set of models and deployment scenarios. Full stack innovations drove a 2.7x increase in throughput and a 60% reduction in the cost per token on GB300 compared to just 6 months ago.
NVIDIA compute is not just the highest performance AI infrastructure. It is the most economic and financeable. Customers do not buy GPUs. They build AI factories, and the right economic metric is not the purchase price of the GPU. It is the lifetime cost of an AI factory producing intelligence: token [ Berat ], tokens per dollar, uptime, utilization, time to production, software durability and asset light. NVIDIA excels at all of them. Agentic AI and reinforcement learning represent new growth opportunities for CPUs. Building on the success of our Grace CPU, Vera is arriving just in time to meet this inflection.
Built on custom Arm [indiscernible] and codesigned end-to-end with Rubin GPUs and NVLink, Vera will deliver up to 1.5x faster performance per core, 2x performance per watt and 4x density per rack compared to x86-based alternatives. Vera CPU opens a brand-new $200 billion [ down ] for NVIDIA, a market we have never addressed before. And every major hyperscaler and system maker is partnering with us to get it deployed. We have visibility to nearly $20 billion in total CPU revenue this year, setting us up to become the world's leading CPU supplier.
Our annual product cadence, a pace that is unmatched, remains a key pillar supporting our market position. We are on track to commence production shipments of Vera Rubin in the second half of this year, starting in Q3. By integrating 7 purpose-built chips across 5 accelerated racks, Vera Rubin will deliver up to 35x higher inference throughput and up to 10x greater AI factory revenue compared with Blackwell. As an early adopter, Google's [ A5X bare metal ] instances, which can support up to 960,000 Rubin GPUs across multiple sites, can enable customers to run their largest AI workloads on NVIDIA's optimized infrastructure. While the U.S. government has approved licenses for H200 to be shipped to China-based customers, we have yet to generate any revenue, and we are uncertain whether any imports will be allowed into the country. As a result, consistent with last quarter, we are not including any China data center compute revenue in our outlook.
Let me move to edge computing. Our edge computing market platform generated $6.4 billion, up 10% quarter-over-quarter and 29% year-over-year. Robust Blackwell workstation demand was a strong contributor to the growth while consumer demand fell modestly due to higher memory and system prices. Our physical AI continues to gain momentum, exceeding $9 billion in revenue over the last 12 months. Our partnership with Uber will power the Robotaxi fleet across nearly 30 cities and 4 continents by 2028. And in robotics, leading companies across a range of industrial, surgical and humanoid applications are building on NVIDIA's technology to develop and deploy at scale. We remain front-footed in securing sufficient supply to support our customers' growth.
In Q1, we increased [ codex ] supply inclusive of inventory purchase commitments and prepaids to $145 billion. While we are not immune to supply challenges, we remain confident in our ability to support the growth opportunity ahead, with our intense focus, scale and long-standing partnerships with critical suppliers continuing to serve us well.
Let me move to the rest of the P&L. GAAP gross margin was 74.9% and non-GAAP gross margin was 75%, largely flat sequentially as Blackwell systems continued to account for most of our shipments. GAAP and non-GAAP operating expenses were up 12% sequentially, primarily due to higher compensation and an increase in compute and infrastructure costs. Our non-GAAP effective tax rate of 16% came in just below our prior outlook due to favorable geographic mix. And on our balance sheet, days sales outstanding was 45 days due to favorable timing of collections. We expect to return to the mid-50s in Q2. We generated record free cash flow of $49 billion, up from $35 billion in Q4.
I'd now like to update you on our capital allocation plan. First, to reiterate, our intention is to prioritize R&D and strategic investments. Both will enable us to cultivate our ecosystem, drive market growth and strengthen our market position. As a key enabler of AI, we will make investments necessary to deliver the industry's lowest cost per token and the highest token throughput, which will help our customers and partners scale and expand the AI frontier. Our return program is another key component of our capital allocation strategy. Given confidence in our long-term free cash flow outlook and our commitment to sharing our success with shareholders, we are increasing our quarterly dividend from $0.01 to $0.20 per share. We plan to review our dividend on a regular basis as we continue to scale our business. We are also announcing an $80 billion share repurchase authorization, which is in addition to the $39 billion remaining on our current plan. As we indicated at GTC, we plan to return roughly 50% of free cash flow to shareholders this year.
Let me turn to the outlook for the second quarter. Total revenue is expected to be $91 billion, plus or minus 2%. We expect sequential growth to be driven primarily by data center. We are continuing to work vigorously on our supply chain ecosystem to address the incredible demand we see ahead of us, giving us full confidence in the $1 trillion in Blackwell and Rubin revenue we foresee from 2025 through calendar 2027.
GAAP and non-GAAP gross margins are expected to be 74.9% and 75%, respectively, plus or minus 50 basis points. For the full year, we are still expecting to be in the mid-70s. GAAP and non-GAAP operating expenses are expected to be approximately $8.5 billion and $8.3 billion, respectively. For the full year, we now expect OpEx to grow somewhere in the upper 40s on a year-over-year basis, driven by higher R&D and acceleration in the usage of AI tools to enhance productivity. For the full year 2027, we expect GAAP and non-GAAP tax rates to be between 16% and 18%, excluding any discrete items from material changes to our tax environment. This is lower than our prior expectation of [ 17% to 19% ] due to changes in geographic mix.
That puts me at the end of this part. And I'm going to now turn this over to the Q&A with Toshiya.
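The headline figures in the prepared remarks can be cross-checked with simple arithmetic. The sketch below is an editorial illustration in Python, not part of the call: the helper names `guidance_range` and `growth` are hypothetical, and the inputs are the figures stated above ($82 billion Q1 revenue, a $13.5 billion sequential increase, and Q2 guidance of $91 billion plus or minus 2%).

```python
def guidance_range(midpoint, tolerance):
    """Return the (low, high) band around a guidance midpoint."""
    return midpoint * (1 - tolerance), midpoint * (1 + tolerance)


def growth(current, prior):
    """Fractional growth of `current` over `prior`."""
    return current / prior - 1


# Figures from the prepared remarks, in billions of USD.
q1_revenue = 82.0
sequential_increase = 13.5
q2_midpoint = 91.0

# Implied prior-quarter revenue and the sequential growth it produces.
prior_quarter = q1_revenue - sequential_increase
q1_sequential_growth = growth(q1_revenue, prior_quarter)

# Q2 guidance band and the growth implied at the midpoint.
q2_low, q2_high = guidance_range(q2_midpoint, 0.02)
q2_implied_growth = growth(q2_midpoint, q1_revenue)

print(f"Implied prior quarter: {prior_quarter:.1f}")
print(f"Q1 sequential growth: {q1_sequential_growth:.1%}")
print(f"Q2 guidance band: {q2_low:.2f} to {q2_high:.2f}")
print(f"Q2 implied growth at midpoint: {q2_implied_growth:.1%}")
```

Under these inputs the $13.5 billion increase works out to roughly 19.7% sequential growth, which is consistent with the stated 20%, and the Q2 midpoint implies about 11% sequential growth.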
Thanks, Colette. We will now transition to Q&A. Operator, please poll for questions.
[Operator Instructions] Your first question comes from Joseph Moore with Morgan Stanley.
2. Question Answer
I guess I'd like to ask what drove the change in segmentation? What's the philosophy behind giving us the numbers that way? And then can you talk about any competitive differences between the 2 segments and this kind of surprising CPU number that you talked about, how do you see that across the 2 segments as well?
Yes. Thanks, Joe. First of all, Colette meant to say we're increasing our quarterly dividend from $0.01 to $0.25. I think that extra $0.05 would mean a lot to the large shareholders.
So anyhow, let's see, Joe, on the segmentation and the description of the business. We wanted you to understand our business better. AI is very diverse and computing is diverse. They're diverse in several ways. The first thing, of course, is AI includes languages and depending on the different industries, it could be 3D graphics for manufacturing and industrial robotics. It could be proteins for life sciences. It could be small chemicals or life sciences or material sciences. It could be physics for the physical sciences, whether it's in the energy sector or, of course, the science labs, higher education, so on and so forth. So AI is diverse.
The second thing is the applications are diverse. It could be in enterprise, it could be in the energy sector, manufacturing sector and such. Where it runs is diverse. It could be in the hyperscale cloud. It could be AI natives, there's a whole network of AI natives that are popping up around the world. Enterprises on-prem, industrial in the factories, in the plants, all the way to supercomputing centers and the edge. Edge, including, of course, what most people see, self-driving cars, robotics, but also a large, growing network of computers inside manufacturing plants, whether it's a chip plant or packaging or computer plants, all kinds of different types of manufacturing plants. And then, of course, in the future, every single base station, every single radio network would become an AI-powered radio network. And so where it runs.
And then lastly, how it's governed. It could be operated by public cloud, but it could also have industrial regulatory reasons that prevents it from being run in a regulatory cloud. It could be because of confidential computing. It could be because of national security reasons, different data centers have to be built differently. NVIDIA is quite unique in the sense that we are the only company that builds all of the technology components. We build it in an extreme co-design way in a complete end-to-end way and a full stack way. But then we, of course, open the platform so that it could be integrated into all the different environments. But some environments just require -- an enterprise, for example, require a company who has all of the technologies working together so that they don't have to build it. They would like to buy it and operate it. And so there's many different segments of the data center market, where NVIDIA's total solution, fully integrated solution with full stack, but still open, that way of doing -- of producing or delivering products is really, really important.
And so if you look at our different segments, the way we broke it out into 3 large segments. You take all of the words that I just said, and you try to find the simplest factoring of it. It would be the hyperscale clouds, that would be one large segment. And within that segment, there's 3 different ways that we operate. First way is that we help the hyperscale clouds accelerate their data processing and machine learning workloads. We accelerate and support their AI processing inside. We also, of course, bring a lot of business, NVIDIA ecosystem business, to their public clouds. And so that's one segment.
The second segment is AI natives, enterprise on-prems, industrial on-prems and sovereign AI. That segment is growing incredibly fast because everybody needs AI, and we're going to see AI being adopted by every industry, every country, every company. And so everybody wants to build it in a different way. And the fact that we provide the entire solution, it makes it much easier, makes it possible at all for people to be able to build these things. And then, of course, the robotic edge.
Yesterday's computing was largely about personal computing. In the future, it's going to be about personal AI, and that personal AI, one example of it is the self-driving car. It's a car, it's a robotic system that's essentially your personal AI. And of course, there'll be all kinds of different types of robotic systems, including even the base station radio network, as I mentioned, is going to be essentially a robotic system.
So that's the reason why we broke it all apart this way. It's the simplest way of understanding our business. Each one of them has different stacks in a lot of ways. They have different operating systems. They operate in a different way. We go to market very differently in each one of them. The easiest go-to-market, of course, is the hyperscalers because there are only 5 or 6 of them. But the rest of the industry represents some 250,000 companies around the world. That go-to-market is very complex, very diverse. Your understanding of AI has to be extremely diverse. And as you know, NVIDIA has a large -- the largest suite of acceleration libraries in the world, from computational lithography to fluid dynamics to particle physics to molecular dynamics, the list goes on. And all of those libraries are essential for us to engage the vertical industries that represent the second and the third category, okay?
So anyways, it's really about the fact that our business has now evolved and grown to such a large scale it's helpful to segment it so that you have a better understanding of how our business works.
Your next question comes from Ben Reitzes with Melius Research.
I wanted to ask Jensen, I wanted to ask you about your philosophy on growth. Your data center business ex China grew about 120% in the quarter and then you're guiding about [ 100% ]. CapEx at the hyperscalers is forecast by many, including myself, to like grow 90% to 100% this year. And you talked about data center still on track to be $3 trillion to $4 trillion by the end of the decade. I was just wondering the goal for the company to grow faster than hyperscaler CapEx, Do you still -- are you comfortable in kind of endorsing that view? And do you still see hyperscaler CapEx kind of still growing after this year at a very rapid clip?
Yes. Thanks, Ben. So first of all, we should be growing faster than hyperscale CapEx. And the reason for that is illustrated by the segmentation that I just described. Our data center business has 2 large parts. It has more parts than that, but we combined it into 2 large parts for simplicity's sake. It's much more complex than the 2 large parts but I combined it into 2, so that it's at least easier to understand, okay?
And so if you look at the first part, it's hyperscalers. That's the hyperscale CapEx that you were just talking about. And they're at $1 trillion this year. I -- every expectation is that it's going to grow from here for fundamentally good reasons. This is the way computing is going to work in the future. And if they don't have the compute, they won't have the revenues. It is very clear, compute is revenues, compute is profit. And so the world is changing. Software, SaaS, didn't use to use as much compute, but AI requires a tremendous amount of compute. But you could do, of course, incredibly more, which is the reason why we heard about the frontier AI companies, both Anthropic and OpenAI, growing at an incredible pace. The fact that they can grow within 1 month what some of the SaaS companies would have taken a decade to grow tells you something. And so the first category is hyperscale and the CapEx is at $1 trillion and it's growing towards the [ $3 trillion to $4 trillion ].
The second category, the second category is all of the AI native clouds. They're regional. They're all over the place. There are start-ups all over the world supporting those companies. They're enterprise, 250,000 enterprise companies around the world, many of them will have to build or want to build AI factories for themselves to operate. Many industrial companies, there's no choice but to put the computer where the context is, where the action is. You can't put that in the cloud. It has to respond reliably, quickly, every single time. You can imagine, a chip plant, a chip fab being connected to a cloud service provider doesn't make any sense. And so the second category and the sovereign AI clouds. And so there's a whole category of data centers that semi-custom chips just don't apply to, because these data centers want to buy systems, they want to operate systems, they don't want to design, they don't want to build it themselves.
And so the second category is extremely diverse. Instead of 5 or 6, 7 companies representing the revenues associated with our first category, the second category is hundreds, thousands of companies and in the future, will be hundreds of thousands of companies with a large number of companies with smaller installations. And that category is going to continue to grow at incredible pace. This second category, when I talk about physical AI, and I talk about how the rest of the $100 trillion industry that has not been affected by -- impacted by IT in the last 30 years. It's about to be impacted by AI. That is the segment that I'm talking about. The second cluster is growing incredibly fast.
Our share of that, of course, is very, very large. We're fairly unique in our ability to be able to serve this industry. Our platform is built like it's vertically integrated so that everything works, but then we disassemble it so that people could build and buy it in the configuration they want and assemble it the way they like. And so this second category is fairly poorly understood because there are just so many small companies or so many companies, and each one of the installations is relatively small compared to, of course, one of the hyperscalers.
And so if you look at the segmentation and the size of each, you could see that, in fact, we're growing share in the hyperscalers because we now have much bigger support from Anthropic, a new partner of ours, and we're helping them expand their capacity greatly in the coming years. And then the second: very few companies have exposure to the second category, because of the platform solution that we have.
Your next question comes from C.J. Muse with Cantor Fitzgerald.
You have Vera Rubin coming soon, and you obviously have great insight into coming updates to frontier models, new techniques to optimize around diverse AI workloads with interest keenly focused on your market share and inference. How do you see Vera Rubin in your [ Extreme co-engineering ] impacting your share of the inference market as we look into late '26, '27?
Well, we are growing share in inference, and we're growing share in inference very, very quickly. And the reason for that is this year, the number of frontier model companies grew. And so there's Cursor and [ Complexity ] and there's some new model companies, TML and Reflection and the list goes on. And so the number of frontier model companies has grown, and we added Anthropic to our partnership this year. They're expanding incredibly fast. We've partnered with them to secure computing capacity across Azure, AWS, CoreWeave, I forget who else we've already announced, but there's a whole list of others that we are bringing online for them. And so the amount of capacity that we're going to bring online for Anthropic this year and next year is going to be quite significant, very significant.
And so we're growing -- and our coverage of Anthropic has been largely 0 until just recently. And so we're gaining share tremendously in inference. Vera Rubin is going to be even more successful than Grace Blackwell at this point. Every single, I can't think of one. Every single frontier model company will jump on Vera Rubin from the get-go, and that wasn't true before on Blackwell. And so Vera Rubin is off to a tremendous start and will surely be more successful than even Grace Blackwell.
So I think the end of your answer, C.J., is that we're gaining share in inference. Let me go back again to the question that Ben was asking. Remember, so far, everything that I've just explained in the inference question is really focused on hyperscale. Remember, there's a whole second category of AI data centers that we serve almost uniquely. Now this segment is very fragmented. It requires a fairly -- really well-integrated platform solution and a very large go-to-market. And that segment, all of the inference, 100% of that -- the vast majority of that is NVIDIA. And then, of course, physical AI. NVIDIA is practically the only company serving physical AI today. And we've been working on physical AI for a long time. And so that is also growing. So our share of inference is growing very quickly.
Your next question comes from Timothy Arcuri with UBS.
Jensen, I wanted to ask about the traction you're getting with some of these custom merchant things you're doing, stuff like CPX and LPX. And I just wanted to ask and see sort of you've talked before about [indiscernible] being, I think, 20% of the market. So I would imagine you're getting pretty good traction with LPX. So can you just talk about that and maybe also how that fits into your broader platform strategy.
The LPX is designed for low latency and high token rate. But its throughput is low, its throughput is low. Its model size capacity is low. And its context processing, its ability to absorb a lot of context, for example, for software coding, for agentic workloads, its ability to absorb a great deal of context is lower. And so the -- so the challenge is simply, and I've explained before, that the use case for LPX is not broad. It's intended for somebody who has a fairly large portfolio of different types of token services. And for the high token rate, maybe these services are quite premium and the number of customers is not significant, but the token rate is very high. And so that remains exactly consistent with what I've said before, and I still expect that.
And so -- so I expect that LPX and other SRAM-based decode focused -- high token rate generated focused accelerators will always be -- would be a niche product for some time to come. As you know, Grace Blackwell and Vera Rubin, we support the entire life cycle of AI from the data processing, preparing for training okay, data processing to pretraining to post training, reinforcement learning, all the way to inference. Grace Blackwell is the best platform in the world to do all of that. And if we -- if in certain circumstances, so long as the customer -- the provider already has a high token rate service that they can offer, then we can tack on an LPX and they could deliver that service even better. And so that's how I see the market.
And I think whether it's 20% or 10% just depends on where we are in the development of AI. I think today, it's a lot less than 20%. Some day, these premium tokens could be 20% and we're ready to work with service providers to enable this capability. I'm excited about it.
Your next question comes from Vivek Arya with Bank of America Securities.
Jensen, there's a lot of excitement around CPU for Agentic applications and just a lot of noise around the number of CPUs actually exceeding the number of GPUs. And I was just hoping that you could kind of give your perspective that, first of all, is this an incremental workload? Is this kind of cannibalizing what the GPU would have done otherwise? And then secondly, the $20 billion number that you gave, is that for stand-alone Vera CPUs? Or is that kind of already included in that Vera as part of Vera Rubin? So just if you could educate us on the role of CPU versus GPU is it cannibalistic? Is it incremental? And then the $20 billion number, how to kind of put that in context with what you sell, right, which is usually the CPU as part of the GPU.
The $20 billion is for stand-alone CPU. And remember, we have Vera is used in 3 ways. As a stand-alone -- 4 ways -- let me just start with the one that you already know. The first way is Vera Rubin. And we'll sell millions of Rubins, and every 2 of them is connected to a Vera. And of course, we price those 2 and they're properly priced. And so that's #1 use case.
The second use case is Vera standalone CPU. The third is Vera with [ CX-9 ] and the software stack for storage. And then Vera in a [ CX-9 ] -- with a software stack for security and compute isolation and confidential computing. Okay, so each one of those use cases is built on Vera. And my sense is that we'll be supply constrained throughout the entire life of Vera Rubin. There are 4 different use cases of it. And -- but anyhow, the answer to your question is of the $20 billion is a stand-alone.
With respect to CPUs, an agent is essentially what people call a harness. The agent has a harness that does the -- and the harness could be open cloud, it could be Hermes, code -- Claude Code is essentially a harness around Claude, around the Opus model. OpenAI's Codex is a harness around the GPT 5.5 model. And so these are harnesses. And these harnesses provide for things like IO, orchestration, memory management, tool use, connecting to tools, for example, browsers and things like that, C compilers, Python compilers. And so the harness runs on CPU. And the tool use runs on CPUs. So for example, if the AI were to do a search or do a browser, use a browser, that would run on the CPU.
The world has 1 billion users, human users. My sense is that the world is going to have billions of agents. Not today, I mean, we're going to grow into it but we'll have billions of agents. And those billions of agents will all use tools. And those tools that can be like PCs, just like us humans using PCs today. In the future, you'll have an agent using PC and so if you kind of think along the lines of in the future, you pick your favorite number of agents at the moment, at the moment, call it, a few hundred thousand, but in the future, call it, eventually a few billion. I could imagine them all using the effectively having PCs that they can all use. And so -- but the large length, every one of those agents are going to spin off subagents. And every time they spin these off, you're going to need to do inference. That's where the thinking happens. All of the thinking happens on GPUs, all of the orchestration essentially runs on CPUs.
And the subagents, when they're spun off, when they're thinking, they use GPUs. Whenever the agents use simulators, those can run on CPUs or GPUs, which is the reason why we're working so closely with Cadence and Synopsys. We're accelerating all of the world's tools and data processing engines and database engines because agents use these tools, and they have lower patience and tolerance than humans, and they want things to happen quickly. And so we're accelerating all the world's tools so that they run on CUDA. And you could see us doing that when I work with Cadence and Synopsys and Siemens and companies like Adobe. And that's because we're trying to get all of the world's tools to run on GPUs because they already have GPUs, and it's a lot faster.
So we're going to need a lot more CPUs, and Vera was designed to be an agentic CPU. The CPUs of the past were designed to have many cores so that they could be easily rentable. People rent cores. Well, agents don't rent cores. They just want the work to be done fast. The economics of the past was dollars per core. That's the economics of cloud computing of the past. The economics of the AI of the future is tokens per dollar or dollars per token. And so what we need to do in the future is to generate tokens, process tokens as fast as possible, and that's what Vera does incredibly well.
So we're expecting to be very successful with Vera. But ultimately, ultimately, what we're doing is we're building infrastructure for AI and it needs incredibly great storage. That's the reason why we built STX. It needs incredibly good networking. That's why we have Spectrum-X. It needs incredibly great GPUs, of course, and inferencing ability. That's the reason why NVLink 72. It needs incredibly great security and confidential computing, which is the reason why Vera Rubin is the world's first platform with end-to-end confidential computing and it needs great CPUs. We've got it all covered.
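The harness described in this answer (orchestration, memory and tool use around a model) can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in and not taken from any product named on the call: `fake_model` plays the role of the inference step said to run on GPUs, `search` plays the role of a tool said to run on CPUs, and the `CALL`/`FINAL` text protocol is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical stand-in for a model call (the "thinking" step).
def fake_model(history: list[str]) -> str:
    # Request one tool call, then finish once a tool result is in the history.
    if any(line.startswith("tool:") for line in history):
        return "FINAL done"
    return "CALL search vera cpu"


# Hypothetical tool; tool use is the part described as running on CPUs.
def search(query: str) -> str:
    return f"results for '{query}'"


@dataclass
class Harness:
    model: Callable[[list[str]], str]
    tools: dict[str, Callable[[str], str]]
    history: list[str] = field(default_factory=list)  # harness-managed memory

    def run(self, goal: str, max_steps: int = 5) -> str:
        self.history.append(f"goal: {goal}")
        for _ in range(max_steps):
            reply = self.model(self.history)       # ask the model what to do next
            if reply.startswith("FINAL "):
                return reply[len("FINAL "):]
            _, name, arg = reply.split(" ", 2)     # parse "CALL <tool> <arg>"
            result = self.tools[name](arg)         # orchestration and tool use
            self.history.append(f"tool:{name} -> {result}")
        return "step limit reached"


harness = Harness(model=fake_model, tools={"search": search})
answer = harness.run("look up Vera")
```

In this toy loop the only model work is the call to `self.model`; the parsing, dispatch and bookkeeping around it are ordinary CPU work, which is the split the answer describes.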
Your next question comes from Stacy Rasgon with Bernstein Research.
I wanted to go back to the segmentation. So first of all, I'm just curious, where do you put the neo clouds across those 2 segments? Are they in hyperscale? Or are they in the AI cloud? Part of me assumes the latter, but I'm not so sure. And then the magnitude of them, I mean they're both about the same magnitude now. It almost sounded to me like you were suggesting that you thought the latter, the AI cloud, would grow faster going forward than hyperscale. Is that what you're trying to say? Or do you see the same kind of growth coming from both segments?
First of all, you're correct that AI native clouds don't build chips, don't design their own chips, and they don't want to -- they can't really assemble unrelated parts together into an AI factory. And their patience, their tolerance for time to first token is extremely low, and their need for an architecture that has a great deal of offtake, so that it runs every model and has customers from everywhere, is incredibly high. And so that's the reason why NVIDIA's architecture is so perfect for them. We offer every component, and whatever we don't offer, our ecosystem of partners offers it, and it's all fully integrated. It all works together. The number of customers that could rent it from an AI native is incredibly high.
Basically, every single AI builder, every AI native startup around the world. SaaS companies, enterprise companies, industrial companies.
And so our computing -- our architecture is the most rentable of any computing platform in the world. So it's the most performing. It's the easiest to put together. It's the most rentable, has the best TCO and it's the easiest to finance. And so all of those properties are quite unique to the needs of AI native. It's in the second category. They're very similar to even OEMs and so on and so forth, large enterprises and so forth surprising, okay? So we put that in the second category.
If you look at that segment, it started growing after the AI ecosystem developed in the hyperscale. Hyperscale developed AI first for a lot of reasons. They have great computer science. They have excellent data center capability. And they also focus largely on consumer applications, which, if not perfect, is not the end of the world. It enhances the service so long as it enhances the service. And so for many of the other applications, industrial applications, enterprise applications, until the AI is very capable and does really productive work and does it safely, and it could do it in a way that can actually generate impact and income, it doesn't really get used. And so you expect the second category to develop slower than hyperscale, and you could see that in the numbers.
However, long term, if you look at industrial and enterprise, clearly, that's where future economics is going to be because it represents some $50 trillion, $80 trillion of the world's economy. And so -- and it's going to be larger than that because of AI. And so I expect the second category to be larger over time. In the near term, over the next several years, I think it's a foregone conclusion that both are going to grow incredibly fast. I expect the second category to still grow faster, but both are going to grow incredibly fast. And then I'm hoping that within the next 5 years, the physical AI and robotics segment is going to grow incredibly fast.
Your next question comes from Jim Schneider with Goldman Sachs.
Back at GTC, I believe you discussed $1 trillion visibility into both your Rubin and Blackwell platform revenue. But I believe that excluded things like LPX, Rubin CPX and the Vera CPU [ Rex ], can you maybe give us a sense about whether the Vera CPUs are going to be the biggest source of upside above and beyond that $1 trillion? Are you contemplating other sort of combinations of products, including CPUs that would allow you to gain an even greater share of that total TAM?
In terms of -- in terms of incremental above the $1 trillion, I would say, one, the continued growing of share of the Frontier AI models. I'm expecting to grow more share. And so I'm expecting that to grow. Number two, we didn't include any Vera CPU, stand-alone CPU in that number. And so I expect that to be the second largest. The TAM is, of course, quite large in agentic systems, and all of our customers are quite excited about Vera and we're going to sell a whole bunch of Veras. And then third would be LPX, because as I explained earlier, LPX, because of its SRAM architecture, has the benefit of very low latency and very, very high interactivity, but its throughput, its context processing ability, is also quite limited.
And that's just kind of the nature of SRAM-based systems. And -- but with the combination, we'll be able to address the entire spectrum of AI, from pretraining to post-training to inference and agentic systems, through the combination of Vera Rubin and LPX.
Your next question comes from Joshua Buchalter with TD Cowen.
And congrats on the great results. Colette, I believe in your prepared remarks, you mentioned GB300 is sort of the fastest ramp in the company's history. How should we think about Vera Rubin against this benchmark? It's obviously a new architecture at the silicon level, but in a similar rack. Does that mean we should expect a similar slope to the Vera Rubin ramp as the GB300? Or should it be a bit more gradual given the new silicon?
Yes. Well, we've indicated for a while that we will be launching Vera Rubin in the second half. We will start in Q3. That will be our initial pieces together. And then once we get to Q4, we're probably going to start to see our ramping continue. It's hard to say at this point what will be a faster ramp. But again, we have demand already planned, we've got POs. We've got almost all of our major customers ready to go, and these are very complex systems that we need to put together. So I think it's just about the timing that it's going to take for us to get that into market. Nothing else other than getting from production of all of the different systems that we have ready for order.
So a little early to say. But yes, we're going to start in Q3 and continue to ramp into Q4 and Q1 of next year certainly is going to be very big as well.
There are no further questions at this time. Toshiya Hari, I turn the call back over to you.
Thank you. Before I hand it over to Jensen, please note Jensen will be giving a keynote at GTC Taipei at Computex on June 1. We will also be participating at the TD Cowen TMT Conference on May 28 and the Bank of America Global Technology Conference on June 4. Our earnings call to discuss the results of our second quarter of fiscal 2027 is scheduled for August 26.
With that, here's Jensen to close this up.
This was an extraordinary quarter. Demand has gone parabolic. The reason is simple. Agentic AI has arrived. AI can now do productive and valuable work. Tokens are now profitable, so model makers are in a race to produce more. In the AI era, compute capacity is revenue and profits.
NVIDIA is the platform of this era. Of all the platforms in the world NVIDIA compute supports the richest diversity of demand.
Let me highlight my top 5 things. First, NVIDIA is the only platform that runs every Frontier AI model. With the addition of Anthropic to our existing partners, OpenAI, xAI, Meta MSL, Gemini and many others, our share of Frontier AI is growing. Second, we are in every hyperscale cloud, supporting their core data processing and machine learning workloads, internal services as well as supporting their demand for NVIDIA users in their public cloud services. Third, our full stack complete AI factory solution and vast global ecosystem let us uniquely address new AI data center segments: new AI native clouds and sovereign AI clouds and on-premises enterprise and industrial infrastructure. This is the second category I was talking about earlier. Fourth, NVIDIA CUDA extends all the way to the edge: robotics, autonomous vehicles, embedded medical instruments, AI-RAN telco base stations. The next wave is physical AI with billions of autonomous and robotic systems operating in the physical world. This is the third segment we were talking about earlier.
And rounding out the top 5 things, we have a major new growth driver, Vera, the world's first CPU purpose-built for agentic AI. Vera opens a brand-new $200 billion TAM for NVIDIA, a market we have never addressed before. And every major hyperscaler and system maker is partnering with us to deploy it. The world is rebuilding computing for agentic AI and robotic physical AI. NVIDIA sits at the center of these transitions. We built the NVIDIA compute platform over 3 decades: one architecture, vast ecosystem, extreme co-design across chips, systems, networking and software. We built it ahead of this moment so that when agentic AI arrived, NVIDIA would be ready. It has arrived.
Look forward to catching up next time.
This concludes today's conference call. You may now disconnect.
NVIDIA — Q1 2027 Earnings Call
NVIDIA delivers an exceptional Q1: strong revenue growth, strong Data Center momentum, high cash flows and aggressive capital returns.
📊 Quarter at a glance
- Revenue: $82 billion (+85% YoY, +20% QoQ)
- Data Center: $75 billion (+92% YoY, +21% QoQ)
- Computing/Networking: Computing $60 billion (+77% YoY), Networking $15 billion (nearly 3x YoY)
- Margin: GAAP 74.9%, non-GAAP 75% (largely stable)
- Cash & returns: free cash flow $49 billion; $20 billion returned in the quarter; new $80 billion share repurchase program
🎯 What management says
- New reporting: two platforms, Data Center (Hyperscale + ACIE for AI clouds, industrial, enterprise) and Edge (robotics, automotive, workstations).
- Product focus: Blackwell dominates training/inference; Spectrum-X and XDR InfiniBand strengthen networking; Vera (CPU) and Rubin (system) are expected to significantly accelerate agentic AI and inference.
- Capital allocation: dividend increase (corrected in Q&A to $0.25/share), additional $80 billion buyback; target of returning ~50% of FCF.
🔭 Outlook & guidance
- Q2 revenue: $91 billion ±2% (growth primarily from Data Center)
- Margins & OpEx: GAAP/non-GAAP ~75% ±50 bps; Q2 OpEx GAAP ~$8.5 billion, non-GAAP ~$8.3 billion; full-year OpEx expected to grow in the upper 40s percent YoY
- Tax & regions: non-GAAP tax rate of 16–18% for the fiscal year; China Data Center revenue remains excluded from guidance for now because of export uncertainty
- Product revenue visibility: continued confidence in ~$1 trillion of Blackwell + Rubin (2025–2027) and nearly $20 billion of visible CPU revenue in the near term
❓ Analyst questions
- Segmentation: why the split? Management: a clearer picture of hyperscalers vs. AI-native/sovereign/enterprise markets and their more distinct go-to-market models.
- Vera/Rubin ramp: demand is high, starting in Q3 with a ramp in Q4; management cites strong customer POs, while timing and production complexity remain unknowns.
- CPU vs. GPU: management sees CPUs (Vera) as complementary (orchestration, tools, agent harness); the $20 billion figure refers to stand-alone CPU revenue. LPX remains a niche product for very high token rates.
⚡ Bottom line
- In short: strong operating momentum and cash generation confirm NVIDIA's leading position in AI infrastructure; new products (Vera/Rubin) offer additional upside.
- Risks: supply chain bottlenecks, uncertainty around China exports, and the challenge of scaling complex system ramps on schedule.
NVIDIA — Shareholder/Analyst Call - NVIDIA Corporation
1. Management Discussion
Good morning, everyone. As we quiet the back room, I have a very important job. As a reminder, the content of this presentation may contain forward-looking statements and investors are advised to read our reports filed with the SEC for information related to risks and uncertainties facing our business. And with that, I will turn it over to Jensen and Colette.
All right. Good morning, everybody. I hope you enjoyed the presentation yesterday. It went a little bit longer, but I think it was an absolutely great summary for us. But we're going to take this time to focus on your needs and some of the additional questions you have.
We're going to start with a couple, maybe the first slide or so, and then we'll open it up for questions. And I'm going to turn this over to Jensen with that.
Yes. As I was saying yesterday, there were 3 inflection points in recent AI. The first one was generative AI. The second was reasoning. And we're at the third inflection point now, and each one builds on the others. There's a lot of technical reasons why each one of them built on the others. But here we are with the third inflection point, which is agentic systems. Agentic systems that are able to operate autonomously. That's why they call them agentic, because they have agency, and you can give them goals. And instead of just answering questions, they can now perform tasks, and tasks could be anything. Of course, one of the most popular applications of agentic systems is writing software. Engineers in your company, I'm sure, and engineers in my company for sure are using agentic systems all day long. And what used to be a thing for engineers is when you come to work, they give you a laptop. Now when you come to work, they give you a laptop and tokens. And token budget is now a real thing. Every engineer is going to have a token budget. And the idea that you would hire a $300,000 engineer and they spend no tokens in doing their job, you've got to ask the question, what are they doing? And so it is very, very clear now that every engineer will have a lot of tokens that we have to consume, and those tokens are going to be produced.
Now I just said something a second ago, if you just connected the dots. It used to be that when an engineer comes to work, a software programmer, somebody comes to work, you give them a laptop. That's a tool. Today, we give them a laptop and tokens. Those tokens have to be manufactured. And so a computer used to be just a tool; a computer of the future is manufacturing equipment. And so these computers, as you see, they're no different than ASML manufacturing equipment in the future. They're producing something that is sold. It's no different than a dynamo machine a long time ago that produced electricity. These are manufacturing systems, and the energy efficiency of it, the production efficiency of it matters for everything because it drives your revenues, okay? And so the third inflection point is here. As you know, Open Claw. Many of these things, these open source projects, when they first drop, they seem like toys. You take a step back and just analyze what Open Claw is first.
[Audio Gap] Our Linux strategy just as we all had to have an Internet strategy, just what is your mobile cloud strategy. Now the question is what's your Open Claw strategy, okay? And so this is a very big deal.
The next I wanted to answer the questions about what I said here a little bit more. First of all, a year ago -- a year ago, I said that we had strong visibility of our Blackwell and Rubin shipments of $500 billion through 2026. I was standing in 2025, right? And so GTC 2025 is around was it March, April? It was October?
October.
Okay. October, I was standing there? You sure. It was October. GTC DC or GTC -- I said it twice, though. The first time I said it was GTC here, right?
I think you've been saying it twice. I don't think all the way back.
I see. Yes. Okay. Anyways. Anyhow, in 2025 -- 2025, one of those months, I said that we have strong visibility of Blackwell plus Rubin demand -- purchase orders and demand okay, very firm demand of $500 billion. And there were a lot of questions from many of you that -- so where are we now? And so you wanted an update on where we are now. And so I thought I'd give you guys an update, where we're standing right now and what month are we just for the record, March. And so here we are in March. Here we are in March, the end of 2027 -- the end of 2027, as you know, is many more months away. I just want to first let you guys know that. However, because we're building infrastructure and factories, and the lead times for everyone is long, they want to make sure they give us firm demand or give us -- purchase orders and firm demand as early as they can to secure their supply, okay? And so we have strong confidence and visibility, visibility and strong confidence of $1 trillion plus. There's -- it's not a floating point number, you guys, okay?
It is also not 94 digits of accuracy, okay? And we're not counting cents. You can keep your cents. However, we have strong visibility of $1 trillion plus of Blackwell plus Rubin. And the reason why it's only Blackwell plus Rubin and not all of the other things that we sell is because I referenced it from last year, when I was only talking about Blackwell and Rubin. Does it make sense? So last year, we didn't have Groq. Last year, we weren't selling stand-alone CPUs. Last year, we didn't have many of the things that we have to sell now. And so it wouldn't have made sense for me to include those today and not because we didn't have those things yesterday. Does that make sense?
Somebody nod, then I can continue, okay? And so therefore, a couple of things. It's only Blackwell and Rubin. It's not Feynman. It's not Rubin Ultra. It's not any of those things. It's not Vera standalone, it's not Groq. So Blackwell plus Rubin, we have high confidence, strong visibility, demand, forecast, purchase orders of $1 trillion plus.
We close business that we ship oftentimes, and we expect to close and ship more business between now and the end of 2027. We expect to close, book and ship more business on top of this between now and 2027. And the reason for that is because we expect to be coming to work between now and the end of 2027. Now unlike other businesses, because we build complete systems of this quality, we can actually win, book and ship new business in the same quarter. Of course, you can't do that if you have to build an ASIC. Obviously, if you don't see it now, you're not shipping it by the end of 2027, but that's not true for us. We build inventory. We have a pipeline of supply, and we have to take care of customers who come out of the blue because they're desperate for more compute. Does that make sense? And so when they're desperate for more compute and all of a sudden, at the last day, they say, goodness gracious, I could use more.
I would like to be able to say, and we are always in a position to say, we'd be more than happy to help you. We're also working on new customers, new markets, new regions that we haven't put in here yet because we still have -- well, about 21 months to go, okay? And so I want you guys to understand what that $1 trillion is. It's, by definition, going to keep growing. By definition, because of what I compare it against, it will keep growing, and it will be larger than that.
A couple of things that I wanted to say also that last year was a really good year because 2025 was our year of inference, and I think we helped everybody understand that the price of the computer and the cost of the token, the price of the computer and the cost of the token are only marginally related. The price of the computer and the cost of the token. Remember, people are buying these computers to produce tokens. The effectiveness of the production of those tokens matter greatly. They're not reselling the computer. If you bought a computer and it's expensive, if you resold it and that's it, then it's expensive. But you bought a computer and it's expensive because the technology is incredible, but it produces tokens at such incredible rates, you have -- simultaneously have purchased the most expensive computer and produce the lowest-cost tokens. Does that make sense? This is what we do every day. This is our job. It is the reason why we deliver the value that we deliver, the value discrepancy that we deliver here, the 2 numbers that I just described is how we're able to secure our gross margins.
We have to deliver and we consistently deliver so much more value, which is tokens per second, which is tokens per second per watt. We deliver so much more value every single generation that customers would prefer to buy our next-generation product at a higher price than our current generation product at a lower price.
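The claim that the price of the computer and the cost of the token are only loosely related can be illustrated with a small back-of-the-envelope sketch. All numbers below are hypothetical and were chosen only to show the shape of the argument; none of them come from the call. The function also ignores power, staffing and utilization, which a real total-cost-of-ownership model would include.

```python
def cost_per_million_tokens(system_price: float, lifetime_years: float,
                            tokens_per_second: float) -> float:
    """Amortized hardware cost per one million tokens produced."""
    seconds = lifetime_years * 365 * 24 * 3600
    total_tokens = tokens_per_second * seconds
    return system_price / total_tokens * 1_000_000


# Hypothetical systems: the newer one costs twice as much
# but produces ten times as many tokens per second.
old = cost_per_million_tokens(system_price=3_000_000, lifetime_years=4,
                              tokens_per_second=100_000)
new = cost_per_million_tokens(system_price=6_000_000, lifetime_years=4,
                              tokens_per_second=1_000_000)
```

Under these made-up inputs the more expensive system produces tokens at one fifth of the hardware cost per token, which is the sense in which a buyer can own the most expensive computer and still produce the lowest-cost tokens.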
They prefer instantaneously to convert the moment that Vera Rubin comes, it is smarter to install Vera Rubins than to continue to buy Grace Blackwells. Are you guys following me? Somebody nod, okay? Because the value is better even though the price is higher. So I'm comparing these 2 systems because these are the 2 de facto systems in the world. And until you can beat these 2 systems, there's no point buying something else. And these 2 systems are incredibly hard to beat because Moore's Law doesn't give you 35x. So Moore's Law alone won't do it. Building a faster chip won't do it. You're going to have to build a faster lots of chips. And so last year was our 2025 year in inference, and I think we demonstrated our inference leadership, training, the post training to now inference. And then some of the other things that we did last year that was really great is we expanded the reach.
We expanded the number of AIs that now support our platform. Last year, 2025, we added Anthropic to our platform, which is net new. We added Meta MSL, which is net new. We're still working with Meta on all of the other stuff. Meta MSL is a net new entity and they have net new computing requirements. And we can all acknowledge that last year, open source software, open-source models really took off to the point where API inference service providers now see that open models approximately represent the second most popular AI model. The first one, of course, is OpenAI in total number of tokens generated. In aggregate, open models represent number two. As you know, NVIDIA is the best platform for open models in the world. We are the standard for open models everywhere. And so number one, OpenAI. Number two, all the open models. Number three, Anthropic. Number four, xAI. Just take your list, keep working.
I think NVIDIA's coverage of models last year increased substantially, which explains our accelerating growth at a very large number. We are already a very large company, as you know, and our rate of growth is actually accelerating. And so anyways, that's -- I think about it.
One last point. We love our hyperscaler partners, and we work very, very closely with them. But it's important to understand that in our relationship with hyperscalers, we're not selling, or not just selling, to them. We attract customers for them. Having CUDA in their cloud brings all of the CUDA developers, all the AI natives, all the large companies that we work with. Whenever we accelerate those large companies, those [indiscernible] small companies, we bring them, we terminate -- we have them hosted in the world's CSPs. We are one of the best sales forces of the world's CSPs.
It is the reason why if you go down to the show floor, they have all of the largest booths. AWS has the largest booth here. Google Cloud has the largest booth here. Azure has the largest booth here. Oracle, giant booth here. CoreWeave, big booth here. Does it make sense? Because we bring customers to them. Why are they here? To talk to, to sell to my developers. And all of our developers only know how to program one thing. They only know how to program CUDA, and they only use CUDA-X libraries. And when we help those developers integrate NVIDIA, they land on one of our CSP partners. We are one of the CSPs' best sales forces, all right.
However, we are also seeing tremendous customer diversity outside of the CSPs. Regional clouds, industrial, enterprise on-prem, where Dell and Lenovo and HP, they're all growing so fast, and all the ODMs are growing so fast. A lot of that business goes towards the right-hand side of that chart, the 40%. Most people see our business in the left 60%. That right 40%, without NVIDIA's full stack, without the fact that we can build [indiscernible] entire AI factory, and the fact that all of the world's open platforms run on top of NVIDIA, you have no hope addressing the 40%. So the net of this chart is this: a big part of that 60% is NVIDIA developers landing in the cloud, and 100% of the 40% is impossible without full stack, without end-to-end.
Was I successful in communicating that? It's important to understand our business. We aggregate that whole thing into what is called accelerated computing and it's probably a disservice to you. So next year, we're going to separate it out a little differently. Well, in the future, we're going to separate out a little differently, and it's going to look probably like this chart. You'll see something like hyperscalers or something like that and 60% of it. And even when you see that, remember, a lot of those customers we brought to the cloud. And then on the right-hand side, that 40% is completely impossible if you just build a chip because they don't buy chips, they buy platforms.
Three messages, all in 1 slide, which probably made your brain blow up. And therefore, I did it again. Was that helpful? I should -- you know what I should have done, I should have made 3 panels or 3 slides. It would have been a 7-hour keynote, but it would have been worth it. Okay. That's it. Thank you. Questions.
We're opening up for questions now.
2. Question Answer
It's Ben Reitzes, Melius Research. Thanks for having us here for this event. It's amazing access that you guys provide. Congrats to you and the team for that. This is great. Jensen last night, when we took a picture, by the way, you all can still like that picture. I need to beat last year's record.
What picture?
We took a quick picture and I posted it, and I'm trying to beat last year's likes.
Okay. All right. All right. So was that in some vulnerable position or anything?
Let's put it this way, the camera added 10 pounds to me, but not to you. I don't know how that works. You look great. So I promised I'd ask you an inference question, and this is related. This is great, like I don't think a lot of people here get this. I think the main pushback we get is the juice worth the squeeze and will the hyperscalers have upside to their revenues for API and cloud that justify all the spend? And what is Jensen seeing? Because I have estimates for the hyperscalers and I've said there's upside to the revenues. But for now, the CapEx is 20% above their cloud API revenue. And I'm wondering what you're seeing. You've said in the past that there's this massive upside to these cash flows and from your customers, particularly hyperscalers and those that are serving Anthropic and OpenAI.
So when do we adjust those higher? I know this is a tough question for you because you got to guide for 3 or 4 other -- 5 other companies. But if we see that upside, I think your stock will behave a lot better because then we'll realize this build can keep going. So when is this inflect -- I mean we're seeing the inflection, but when is it -- what is the upside to their revenues? And how do we feel better about it?
Yes. So I wish those companies were public. And the reason for that is because then you'd see what I see. No company in history, as a start-up company, a nonpublic company, has ever increased revenues by $1 billion or $2 billion a week. That's what they're experiencing right now. Now remember, that's just a week. The entire IT software industry is, call it, $2 trillion. That $2 trillion industry, I don't believe it's going to be disrupted. I think it's going to be transformed. I believe that every company in that $2 trillion IT industry is going to integrate a combination of OpenAI, Anthropic and open models, connected with an open source software called Open Claw that we turned into an enterprise-ready version called NemoClaw, and you have instantly an agent. 1.5 million people downloaded Open Claw and built themselves an agent. It's one line of code. And then you tell the agent to finish building itself.
So you don't -- you don't know this thing, go learn it, and it goes off and learns it. And so in the future, those agents will be integrated into the IT industry. This IT industry is $2 trillion of software licenses today. It's probably going to be -- let me just pick a random number -- $8 trillion that also resells an enormous amount of tokens. 100% of the world's IT industry will become resellers of OpenAI and Anthropic. Are you guys following me? No.
Take your estimates up for Open AI and Anthropic.
I believe that Anthropic and OpenAI -- and of course, all of the IT companies will also modify and customize their own software, their own models with open models, and that's what Nemotron is for and that's what NeMo is for. We've created all the tools, and that's why we're working with all of them. They're all going to create agents that integrate these 3 components. And I believe they're going to grow incredibly. The time is going to come soon. And the reason for that is you could see it in Anthropic's numbers, you could see it in OpenAI's numbers. They are growing -- they're growing an entire IT company in a month.
And the revenues of these AI companies. Their AI will be used by enterprise directly, but it's also going to be resold through IT companies integrated into IT companies. Does that make sense?
Yes.
Because just think of it this way: AI is just software. Their software is going to be offered directly to enterprises, but it's also going to be integrated and become domain-specific and specialized, governed, secured, easily provisioned, connected to their systems of record, so on and so forth. There's going to be a whole -- and that agentic system will be rented to customers, but they still would have to consume tokens through factories. And so if it comes down through OpenAI, that's terrific. If it comes down through Anthropic, that's terrific. If it comes down through open models, that's terrific, but they all have to have tokens generated.
So the net-net is IT companies of the past licensed software, IT companies of the future will rent tokens -- will generate tokens. Are you guys following me? Their business models will change. The companies will become bigger, their gross margins will change. Gross margin profile will change because they now have tokens in the -- they have COGS in their business model now, but they offer greater -- much, much more value. And so this is exciting for them, super exciting for them.
C.J. Muse from Cantor Fitzgerald. Thank you for hosting this event. Really appreciate it. I wanted to, I guess, maybe follow up on Ben's question and think about the evolution of this chart of 60-40. You talked about NemoClaw. And then you announced yesterday the Vera Rubin-DSX AI factory reference design, essentially providing the blueprint for your non-hyperscale customers to compete with the hyperscalers. So I'm curious, as you put it all together and you see a massive spike in token generation, how you're expecting this chart to evolve over time and how we should be thinking about the different players inside there as to their relative kind of growth factors.
I think that both sides of this chart grow at similar rates, approximately, until the physical AI inflection happens in a few years. And so let's say the physical AI inflection happens, then the industrial side has to be done on-prem and it has to be done at the edge. It has to be done in location. It has to be done in the factory. Then all of a sudden, that 40% is likely to grow. And I think, ultimately, that 40% becomes larger. And the reason for that is because the world's industries that are related to physical AI are much, much larger than the industries related to digital AI. Something like $70 trillion of the world's industries, 50, 60, 70, requires physical AI, because the world is happening not in our laptop; the world happens out where the world is. And so there's a lot of atom-related businesses that simply can't be taken care of without physical AI. And so I believe and I hope that that 40% actually becomes 70%, but both of them are going to be incredibly large because the world is going to produce tokens every single day. Continuously. It will not stop.
Right now, as we speak, all of our laptops, well, hopefully most of your laptops, are kind of sitting idle, but in the future, the computer is going to be running 24/7 creating tokens because your agents are off doing work. I was reading one of the Reddit posts. Somebody's claw consumed 50 million tokens in a day. Now that sounds like a lot, but that's only $50, and if you had an agent doing productive work for $50, that's not bad. And so you could have somebody who makes a few thousand dollars a day have a whole bunch of agents spending $50 a day, becoming a lot more productive. This is going to be the norm. I have that at NVIDIA right now as we speak. And I'm hoping the person that I'm paying a couple of thousand dollars a day to is spending more than $50 a day on tokens. Are you nuts? I want you to be managing an entire fleet of agents doing your work.
And so I'm really hoping that somebody who makes $2,000 a day is spending $1,000 a day of tokens. And what I just said makes sense, and it's going to happen, and it's already happening in software companies all over the world.
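The token arithmetic in this answer can be checked with a small sketch. It uses only the figures quoted above (50 million tokens for about $50, and pay of roughly $2,000 a day); the function names are illustrative, and the per-million price is simply what those two figures imply, not a published rate.

```python
def daily_token_cost(tokens_per_day: float, usd_per_million_tokens: float) -> float:
    """Dollar cost of one day's token consumption."""
    return tokens_per_day / 1_000_000 * usd_per_million_tokens


def token_spend_ratio(token_usd_per_day: float, pay_usd_per_day: float) -> float:
    """Token spend expressed as a fraction of a person's daily pay."""
    return token_usd_per_day / pay_usd_per_day


# Figures quoted in the answer: 50 million tokens in a day for about $50.
implied_price = 50 / (50_000_000 / 1_000_000)  # USD per million tokens
cost = daily_token_cost(50_000_000, implied_price)

print(f"implied price: ${implied_price:.2f} per million tokens")
print(f"daily cost:    ${cost:.2f}")
print(f"$50 of tokens on $2,000 of pay:    {token_spend_ratio(50, 2_000):.1%}")
print(f"$1,000 of tokens on $2,000 of pay: {token_spend_ratio(1_000, 2_000):.1%}")
```

Under these figures the implied price is about $1 per million tokens, and the hoped-for $1,000 a day of token spend would be half of the quoted daily pay.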
Stacy Rasgon from Bernstein. I have a quick clarification [indiscernible] Colette and then Jensen, I have a question for you. Colette, just to clarify, I know you've talked about Rubin ramping in the second half. Groq sounds like it's launching in Q3. So am I correct in thinking that Rubin should launch with Groq because I don't think Groq goes stand-alone. And then Jensen, I want to ask a longer-term question of you. I really like the chart you put up the other day. It almost showed, like, sort of the extension of the spectrum of inference, which -- I mean, drove value from Groq. You used to talk about how GPUs were fully the way to go. We now see architectures like Groq are needed to sort of take advantage as that spectrum of inference widens and low latency becomes more important.
I guess I wanted to ask how you see that spectrum evolving from here. Does your platform now have all the pieces that you need as we go forward over the next, like, several years and hopefully longer than that? What are the new types of workloads within inference that you see coming? And do you have all the pieces you need to take advantage of that? Is there something else that we still need to be keeping our eyes on as that grows?
So first, Stacy, thanks for the question regarding Groq and the LPX. We did communicate that that would also be starting in the second half of this year, and we'll see how that looks once we get closer to the second half of the year. But it is in this current year.
I was going to say, Groq shipping in Q3, I think, yesterday.
Correct.
So that's what we're expecting. However, Vera Rubin is going to ship before Groq.
It will ship before?
Yes, yes. And the reason for that is because we're already in production of Vera Rubin. Systems are already going through lines and -- and so at the moment, that's the condition, right? And so -- and it's okay. It's just fine. Vera Rubin is extremely hard to beat, even for Groq. Even adding Groq to Vera Rubin, it's very tough to beat Vera Rubin. And I'm going to explain your question in a second.
It turns out in computing, it's not completely true, but it's close to true, that you have 2 types of architectures: one that's extremely high throughput, one that's extremely low latency. And in fact, a CPU is a low latency computer, and notice the size of the cache on board, the SRAM. Groq is an extreme version of that, a hyperextreme version of that, where the SRAM occupies basically nearly the whole chip. And the scheduling is done completely statically, meaning the compiler figures out where the data and where the compute is and then makes them meet just in time. And the whole Groq system is like 1 giant synchronous machine. As a result, it is deterministic. It's extremely low latency. It is not easy to program. It is not flexible. It's not general purpose, but it is what it is.
And so what we've done is we've taken Vera Rubin, which occupies, as I described yesterday, about 3/4 of that space. There, Vera Rubin is the right answer. We don't know how to make that better. If we knew how to make that better, we would have made that better. NVLink 72 and the Vera Rubin Ultra NVLink 144 and NVLink 1152 are going to keep expanding the aperture of that left-hand side, where high throughput matters tremendously. We're going to add Groq, fuse it with Vera Rubin, fuse it with our GPUs, and use Groq to process the very last stage of autoregressive models, which are used for language models.
That last stage is extremely bandwidth-intensive. And if we ganged up a whole bunch of SRAMs, like thousands of Groq chips, okay, it's 8:1. So that's for that last 25% of the power and that last 25% of the use case, because your data center has all kinds of different use cases. It's not just one, right? We're all using ChatGPT. We're all using it in different ways. We all have different tiers of pricing. And so we're in different bands in my graph. We're in different bands in that graph. Are you guys following me, Stacy?
So there's -- I showed the 0 tier, the free tier, then good, better, best and the extreme version. And so for free, good and better, Vera Rubin is untouchable. We can't think of anything close by. And then for best and extreme, probably, adding Groq to that, you could increase your throughput on the best, and you could extend the extreme version even further. Now, the extreme version has introduced a new tier, but because of the throughput curve, your volume is so low. You can't afford to make that demand too high. So you have to set the price quite high. Does that make sense? However, there's a new class of customers who are very, very rich: software engineers. They already cost so much money that if I added to them $100 a day of inference cost, token cost, I'd be more than happy to do it. If I added even $1,000 at crunch time, more than happy to do it. Does that make sense? And so I'm simply describing what's happening to a market that is, if you will, maturing. In the beginning of the market, nobody knew. The technology wasn't mature and people didn't know exactly how to use it, so 100% of the early inference customers were free tier. And as the technology started to reach o1 and o3, all of a sudden, the paid tier skyrocketed because people are now able to use it for something useful.
Then all of a sudden, agents came. Now, for example, Claude Code, right, Codex, those tokens are a lot more expensive than free tier, and they're a lot more expensive than $20 a month. And so that segment, we just added 2 more segments. Did you see that? And so this is no different than iPhone. In the beginning, there was only 1 version. And now there are a whole lot of versions. No different than the car industry, no different than any industry. As the market expands, the segments expand. I showed a factory that is able to produce tokens of different segments and different tiers, from very, very smart, incredibly fast to high throughput free tier. And I described an AI factory architecture that allows you to address the whole thing to maximize ultimately the total revenues of the factory, and we let you decide how you want to mix and match. My estimate is it's probably about 25% today for, call it, a handful of companies. You need to generate a lot of tokens to make it worthwhile. And so -- and then there's a whole bunch of -- they call them inference service providers, ISPs, API service providers.
I think they could also benefit from this, okay, because they would like to have a different segmentation of token generation. And so I call it a group of 10 customers, and 25% of those 10 customers represents a big part of that pie. We can increase our total revenues with Groq by 2x on 25%, 2x by 25%. Does that make sense? So say, 25%.
And I mean, as you continue, like with new versions of Groq, with new generations, what does that do? Are you pushing that out even further? Or are you lowering the cost and increasing the demand? Like I'm just trying to get some feeling.
We're always doing 1 of 2 things. We're pushing the throughput at every tier up and we're always pushing the smartness of the AI out. And so you see the Pareto. I'm always pushing it up. I actually did the transition showing you guys from Hopper to Blackwell to Vera Rubin. So I'm always pushing it up, and I'm always pushing it out. Whenever I push up, the production volume of your factory goes up at every price point. ISO price point, the volume goes up, okay? When I push it out, you can introduce new tiers of AI, new tiers of tokens. And therefore, you've got a new price point. Today, the price point is, call it, $6 per million tokens. That's kind of where the world is. We'd really like to be -- I know they would all love to be at $50 per million tokens, but with super large models, super fast.
Could you imagine a 10 trillion parameter model running at 500 tokens per second? Our engineers will pay big money for that, and I would let my engineers pay big money for that. And so that world wants to come, and then the next year it will come again, because the models will get bigger, they'll think more, they'll use more tools and things like that. It's just like back in the old days. I don't know how many of you knew NVIDIA in the beginning, but we had 1 product, RIVA 128, RIVA 128, $299. That was it. One product, those good old days. And then today, we have 5090, 5080, 2 different SKUs, 5070, 3 different SKUs, 50 -- are you guys following me? And all of these SKUs exist because the market got larger and it started to segment and people wanted different things.
The market is doing exactly the same thing with tokens: it's getting larger and larger, with different segments wanting different things. And so I need to -- we need to help the customers. We need to help our model makers produce, manufacture different segments of tokens. I know they look like numbers, but they're different AIs. Makes sense?
Got it. It does.
Yes, so incredible. So we're going to increase the throughput, and we're going to increase their pricing simultaneously. That's the benefit of Vera Rubin. And we did that every single time. We did that with Blackwell. We did that with Vera Rubin. We're going to do that with Vera Rubin with Groq. We can do that with Vera Rubin Ultra. We're just going to keep pushing that envelope, and ultimately, the simplistic way is that Pareto chart, because the factory is a lot of different workloads and different customers. That Pareto chart, we want to push the Pareto frontier up and out, constantly up and out, and the computer science necessary to do that is insane, the hardest problem of all.
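As a minimal sketch of what "pushing the Pareto frontier up and out" means, the snippet below computes a Pareto frontier over two axes and checks whether one generation's frontier covers another's. The axis names and every data point are hypothetical placeholders chosen for illustration; they are not figures from the presentation.

```python
from typing import List, Tuple

# (interactivity, throughput): higher is better on both axes. Both axes are assumed.
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point, sorted by interactivity."""
    frontier = []
    best_throughput = float("-inf")
    # Walk from most to least interactive. A point survives only if its throughput
    # beats every point that is at least as interactive.
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_throughput:
            frontier.append((x, y))
            best_throughput = y
    return sorted(frontier)


def pushed_up_and_out(old: List[Point], new: List[Point]) -> bool:
    """True if every point on the old frontier is matched or beaten on both axes."""
    return all(
        any(nx >= ox and ny >= oy for nx, ny in new)
        for ox, oy in pareto_frontier(old)
    )


# Hypothetical operating points for two hardware generations.
gen_a = [(10, 100), (50, 60), (100, 20), (40, 50)]
gen_b = [(10, 300), (50, 200), (100, 90), (200, 30)]

print(pareto_frontier(gen_a))
print(pushed_up_and_out(gen_a, gen_b))
```

In this toy data, the second generation raises throughput at every existing interactivity level ("up") and adds a faster operating point that did not exist before ("out").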
Vivek Arya from Bank of America Securities. Thanks Jensen, thanks Colette for hosting us and for a very informative event. I wanted to ask actually 2 related questions. One is on this $1 trillion, Jensen, that you showed. You have other products also that you spoke about yesterday, right, the Vera CPU, right, other CPUs, you have Groq. You have a storage solution, right, CPX prior too, I assume. So how much of that is incremental, right? Is it a small number? Is it a medium -- like how much more is that addressable market that is not captured in this $1 trillion, assuming it is incremental to this? And then I wanted to double-click on Groq again, Jensen. I think you mentioned that it will take up 25% of the inference. That's a pretty big statement. And is it cannibalizing something? Is it -- what is kind of the value capture from Groq over time? And a lot of people ask us, is it cannibalistic of high-bandwidth memory demand?
I don't think it is, but I would love to hear your view on how to kind of put Groq in the value capture, right, part of the spectrum.
Okay. We're the only company in the world today that can optimize an AI factory architecture across 3 memories: of course, HBM memory, but we're the first to use LPDDR5, which is extremely high bandwidth and very low power. And that changes the equation for CPUs. And the third is SRAM. We can now utilize all 3 memory types to create the perfect architecture, and we are, okay? That's number one. We used to offer just NVL72 Grace Blackwell. That was our rack. We had 1 rack. We now have 5 racks, as you know. And the reason why is because -- can you go to the next slide? Thank you. That was previous. Yes. So let's go. No. Back. There you go. Is that the one? Yes. You see that. This is what NVL72 did. It ran that. Are you guys following me? It ran all these large language models. This is what it was designed to do. And all of our inference stack ran that.
But remember what an agentic system is: it runs this. This is what Claude Code now does. This is what Codex now does. It runs all of this. It has memory. That goes into the KV cache. It has -- and that's on the STX system. This memory has grown so much that it needs to be accelerated. It's just too much -- all of our working memories, every time we use it, the more we use it, the harder the problem we solve.
This is structured and unstructured data. This is where I started the keynote with [indiscernible]. The stuff that nobody ever talks about, which is incredibly valuable in the future because this agent is way faster than a human and it's going to bang on that way harder and faster. Does that make sense? And then tool use, web browser. And so a web browser runs on a CPU. And so you need a CPU to give the agent access to tools. And then it spawns off subagents, and who knows what those could be.
One of the subagents could be cuOpt, which is GPU accelerated. Some subagent could be Omniverse, GPU accelerated. And so we need those kinds of GPUs in the data center. So the way to think about what Vera Rubin is: Vera Rubin as a system expanded tremendously because we went from processing that, which is -- it's still 90% of the workload -- to processing all of this. Are you guys following me? This is AI. This is where ChatGPT started, but this is where it is now. Can someone nod? You guys get it? Okay. Give me a thumbs up. All right. Thank you. And so because I'll do it again. This is like -- sometimes our keynotes run long because I look in the audience, and there's some person sitting in front of me that looks lost.
And so I just -- I'm going to have to do this again. I don't leave nobody behind, and so this is an agent. So what just happened in our data center? That data center doesn't want to be a cobbled-up Frankenstein, and it wants to use elegant power delivery and cooling systems. And so we took all of the computers that are here, and we put them into the MGX rack, and we designed the perfect processor for each 1 of these things and just racked them up.
Does it make sense? And so -- and if you're going to put storage, which is right up there in here, if you're going to put that in the east-west, which is in the same aisle as the compute, you better make it so it's not a Frankenstein outfit. You can't have liquid cooled in the NVLink 72 racks and then air cooled. You can't have 300 kilowatts here and then use 50 kilowatts here. It makes no sense. And so we took the whole thing and we harmonized all of it in 1 single rack architecture.
And so if you want to build a cluster to run that, you just connect them all up. It's incredible. Same power delivery, same cooling system, all 100% liquid cooled, all completely optimized for the workload, all fully accelerated. And so now your question: in order to run this agent and be able to offer all the things that we were just talking to Stacy about, you would increase your CapEx, you would increase your compute spend, the GPU compute spend, by 25%. And so you add Groq to 25% of the workload. And you buy 8x as many chips, which is approximately the same price as the NVLink 72 racks, okay? So 25% is multiplied by 2, and that's the same as adding 25%, okay? And so your compute spend goes up by 25%. That's the first one. And that's not in the $1 trillion. And so if 100% of that $1 trillion now adds Groq, then it will be $1.25 trillion, okay? And then we also have storage, which is a lot because, as you know, there's just a lot of storage in the world.
It is the second largest compute spend. And then the third will be CPUs for tool use. But I'm not expecting CPUs to be that much, because CPUs just don't add up to too much, okay? And so you could say CPU is another 5%, okay? So if you were to say, all in, the difference between Grace Blackwell racks, which as you saw was however big it was, and the Vera Rubin racks, okay, if it added another 50% opportunity, I think that's probably not far off. Did I just kind of reason through it for you? Is that -- everybody got that, okay? And so that's the fundamental difference between the Grace Blackwell go-to-market and the Vera Rubin go-to-market. Because in the Grace Blackwell world, we were solving inference. We wanted to be inference king, who doesn't, right? And so that's what we were solving. With Vera Rubin, we're solving for this.
That's why I said OpenClaw is completely transformational. Finally, we have 1 piece of software that runs across this whole thing. One open source software, it is the operating system of this chart. It's incredible. Now every company in the world can go build this.
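The sizing arithmetic in this answer can be laid out as a short sketch. The 25% workload share, the doubling of spend on that share, the 5% for CPUs and the roughly 50% all-in figure come from the answer; the storage share is not stated, so the value below is only the remainder those figures imply, and all names are illustrative.

```python
def groq_uplift(workload_share: float, spend_multiplier: float) -> float:
    """Incremental compute spend, as a fraction of the base, from adding
    Groq to part of the workload."""
    return workload_share * (spend_multiplier - 1.0)


BASE_TRILLIONS = 1.0  # the $1 trillion figure referenced in the question

groq = groq_uplift(workload_share=0.25, spend_multiplier=2.0)  # "25% multiplied by 2"
cpu = 0.05       # "CPU is another 5%"
all_in = 0.50    # "another 50% opportunity ... probably not far off"

# Assumption: storage is whatever remains of the ~50%. It was not quantified.
storage_implied = all_in - groq - cpu

print(f"with Groq only: ${BASE_TRILLIONS * (1 + groq):.2f} trillion")
print(f"implied storage share: {storage_implied:.0%}")
print(f"all in: ${BASE_TRILLIONS * (1 + all_in):.2f} trillion")
```

Read this way, the quoted figures give $1.25 trillion with Groq alone and about $1.5 trillion all in, with storage implicitly accounting for around 20 points of the uplift.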
Joe Moore from Morgan Stanley. You're generating $1 billion every couple of days, which seems pretty good. Can you talk about the uses of that cash to build strategic advantage in your business? You're making investments in ecosystem partners. You've got purchase commitments on components, you're also returning cash to shareholders. How do you balance those priorities?
Well, the priorities have to go: number one, it has to fund our growth. And our supply chain, we work very closely with, and we're in a great place with our supply chain today for a good reason. And it's because we work very long term with them. We help them plan their business. We award business to them to support their growth. We even prepay, and sometimes we'll even fund their capacity growth with them. But we're preparing for $1 trillion over the next -- I'll just have to be very clear: for $1 trillion plus through December 25th. I think we probably shut it down at 4:00 p.m., and so through that time, Pacific Standard Time. There's a lot of caveats in there, just to make sure. But anyway, the plus. And so that's number one. Number two, we invest in our ecosystem because, as you know, the CUDA developers and the growth of these AI natives at this stage is really important. And then after that, we're still going to generate quite an amount of free cash flow. And so, well, I'll let Colette answer it. I mean, we have a good plan. So go ahead.
Yes. So with the strong growth that we have at the $1 trillion going forward, that gives us, of course, a very good position in terms of free cash flows. He talked about some of them up front, in terms of making sure that our suppliers and everything that we need to build is in order, and that may take some prepaids.
The second thing is our investments. We are still working through the commitments that we made over the last year that we need to complete in the first half of this year. But once we move forward and complete those, we do have an opportunity for stock repurchases and focusing on returning capital to our shareholders. It is still a very important part of the work that we are going to do. We had a good year last year, and I think we're going to have another great year in terms of what we can do in terms of returning capital to them. Do you want to give certainty on that?
It's up to you.
Okay. Where we stand right now, and this is probably not taking into account the plus sign, we will probably be at 50% stock repurchases and dividends together as a percentage of our free cash flow. So that's where we're starting out. And as you can see, the plus sign is real. And then that does give us an additional opportunity to do even more. The timing of it, again, remember, depends on working through what we have to do here in the first half of the year with some of our existing commitments, but stay tuned.
It's Tim Arcuri at UBS. So let me preface this by saying that this is not what I think, but this is what I hear from a lot of folks out there. So there's some concern that you're capturing too much of the value of the ecosystem and you can -- and that you can't sustain these margins over time. So how do you respond to those concerns? I know you see stuff online about having to invest in the ecosystem and people sort of spin that in a negative way. So can you just talk about how you can sustain your margins?
First of all, almost everything I told you guys yesterday is a new perspective. It is not illogical that everybody has to understand tokenomics. It is not illogical that the world needs to learn what a computer has become. If we continue to deliver X factors of tokens per second per watt every year, if we continue to deliver X factors of ASP increase for them because we introduced new token segments, customers will be more than delighted to continue to do work with us. And it's also true, and I've said it before, and the math is absolutely clear. Every CEO of every cloud service provider, I would challenge them all to go and create that chart for themselves. And I'll help them. And you pick your favorite other configuration, third-party chips, build-your-own chips, and you put it into that model faithfully, and then you can decide: would you like to have higher revenues or lower? Would you like to have higher ASPs or lower? Would you like higher margins or lower? Because that's all it means. Look, TSMC's wafers are the highest priced in the world, but they're the best value in the world. And I gladly pay for it. And same idea: ASML systems are the most expensive in the world, and they're worth it. There's no question about it. And so the question is simply, do you want to make more money? Or do you want to buy the lowest cost equipment? Do you want to make more money? Or do you want to buy the lowest cost equipment? That's the difference.
Now what I just said is a new concept, and I think we can all acknowledge that. I just treated a computer system the way I treat a TSMC chip factory, the way I treat ASML manufacturing equipment. And that's not the way people thought about it in the past. If I have 2 CPUs, 1 of them is 256 cores, the other 1 is 256 cores, tell me which one is the better one. Well, the cheaper one's the better one because I'm renting it by the core anyway. But that's not the way tokens are created. You don't rent by the core, you monetize by the tokens per second. And so it's a different economics. Does it make sense? You're not renting cores, you're not renting nodes. You're producing tokens, which is the reason why everything changed. It was necessary to make sure that everybody understands the economics of the new world. So anybody who says that simply does not understand the business, that's all. They're trying to buy the lowest cost equipment.
"My equipment costs 30% less." What does that mean to your factory? What does that mean to your factory? That's really the question. And so I think anybody who says, "My chips are 50% cheaper": put that in the context of the factory, and that person is actually demonstrating to you they don't understand AI. Somebody goes, "I'm 30% cheaper," you don't understand AI. "I'm 40% cheaper," you don't understand AI. "My chips are cheaper." You don't understand AI. I'm not talking about anybody. I was just saying. It's a theoretical comment.
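The "put it in the context of the factory" argument can be sketched as a toy model of a power-limited site, where annual token revenue scales with tokens per second per watt. Every number below is a hypothetical placeholder (site size, capex, efficiency), with only the $6 per million tokens taken from the ballpark quoted earlier in the session; the outcome depends entirely on the efficiency gap one assumes.

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 3600


@dataclass
class System:
    name: str
    capex_usd: float                # hypothetical cost to fill the site
    tokens_per_sec_per_watt: float  # hypothetical efficiency


def annual_token_revenue(system: System, site_watts: float,
                         usd_per_million_tokens: float) -> float:
    """Revenue if the whole power budget produces tokens all year."""
    tokens = system.tokens_per_sec_per_watt * site_watts * SECONDS_PER_YEAR
    return tokens / 1_000_000 * usd_per_million_tokens


site_watts = 1_000_000  # hypothetical 1 MW site
price = 6.0             # USD per million tokens

incumbent = System("incumbent", capex_usd=100e6, tokens_per_sec_per_watt=2.0)
cheaper = System("30% cheaper", capex_usd=70e6, tokens_per_sec_per_watt=1.0)

for s in (incumbent, cheaper):
    revenue = annual_token_revenue(s, site_watts, price)
    print(f"{s.name}: capex ${s.capex_usd / 1e6:.0f}M, revenue ${revenue / 1e6:.1f}M/yr")
```

With these assumed inputs, the cheaper system saves $30 million of capex but forgoes far more than that in annual revenue from the same power budget. If the two systems had equal tokens per second per watt, the cheaper one would win, which is why the efficiency figure is the number to examine.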
Josh Buchalter from TD Cowen. Thank you for spending the morning with us, and there's a lot of customers and partners that are after your time, so we appreciate it. I wanted to ask a question. You said a few times, I think, yesterday that you expect to be short capacity in 2027. Can you elaborate on where you're seeing those shortages? And on that note, you've described yourself as the chief revenue destroyer. And Satya's made some comments about not wanting to over-index to 1 generation. There's another 1 coming very soon. Is that behavior unique to Microsoft? And are these constraints sort of protecting it?
By the way, Satya would also tell you who told him that. Exactly. I told Satya, buy what you need this year because next year, there will be something better.
So I guess my question on that is, are TSMC constraints or the capacity sort of protecting your other customers from doing that? Or do you see them holding a similar mindset as Satya's?
I think I don't want you guys to thinly slice and dice our choice of words. Is the world supply constrained at some level? Yes, right? Can we all agree? Saying the opposite is weird. Hi. Is the world constrained on cars? Well, you see cars in -- if I tripled the demand? Yes. And so everything is somewhat constrained. It just depends on everything. And because we're building at such a large scale, our life is just not simplistic. It's not so simplistic that I can say, "Oh, if I just solve this 1 problem, that's it, life is good." We are working multiple dimensions across multiple suppliers and making sure that things are in harmony -- we don't have too much.
We don't have too little. We can meet our demand plus. And the reason why we want to meet our demand plus is because there's always new demand coming for the next 21 months. I got a whole bunch of new demand that's coming. And so I got to prepare for that.
And so there are all kinds of parameters, and it's not simple. And if I told you that we are supply constrained on this 1 item, then I know what you guys are going to do. You know -- so I think the system is harmonious. Nothing is too much, nothing is too little. We don't have too much power. We don't have too little power. We don't have too many construction workers. We don't have too many plumbers. We don't have too few plumbers. We don't have enough -- we don't have too many cables. We don't have too many optics. We have -- we don't have too few optics. We don't have -- are you guys following? It's just kind of right there, and we'll work it every day. Perfect.
Aaron Rakers with Wells Fargo. Thanks for doing this as well. I'm surprised we got to this point without this question being asked and it's more technical. There's a lot of discussion.
You know what, we're kind of like the Fed now. "Did he say near or almost? And what did he mean by -- well, we've got to go through all of his previous transcripts. And when did he use that word?" Here's what I know. Demand is accelerating at a very large scale. And we'll be able to support the supply.
Perfect. So I was going to ask about architecture. I've gotten a lot of questions about yesterday's presentation where CPO starts where copper ends. You outlined NVL 576, there was NVL -- or 1152 on the slide. So I'm curious of what is your current thought process around offering both. And how does that evolve as we scale to Vera Rubin ultra refinement, just curious to your thoughts.
Okay. Please treat my partners properly. They're all doing great, okay? I'm not saying anything here that suggest any of their businesses, I'm going to go the other way. All of their businesses are going to grow because of us. We're going to grow copper. We're going to grow optics tremendously. We're going to grow copper. We're going to grow optics tremendously. Now did I say something that is completely logical? The answer is yes. And let me tell you why. We should scale with copper as long -- as far as we can as long as we can, but at a meter plus or minus, it's kind of the limits of copper, okay? And so you've seen us go from NVLink 72 to now Rubin Ultra NVLink 144, right, where the back plane was designed to be able to support that, okay. So that's kind of approximately -- and we're going to keep working on our series and if we could extend it from 144 to 288, we'll be more than happy to do so because you should use copper for as long as you can because copper is just easy to manufacture. It's more reliable. We've been manufactured for a long time. Humanity has been using it for a long time. And so did I say anything that's illogical to anybody? Everybody makes sense. You should breathe air for as long as you can until you out of it. After that, we'll breathe like compressed liquid air. But until then, how about air.
It's free. We've been using it for a long time. It's safe, all right? And so one, we should scale up with copper as long as we can. As you know, we also took Ethernet to a structure cable backplane. So that's incremental growth opportunity. Did I -- isn't that right? I just said it yesterday. We're going to take the backplane of Ethernet, and we turned it into these spines because these structured cables are really easy. Now that we got -- we mastered how to use it and manufacture it is, it's a real artistry we now can create these things and you -- it's easy to maintain, it's easy to ship, easy to wire it up. You make no mistakes, right? It's fantastic. However, simultaneously, we want to scale up beyond 72 to 144, right, to 1152 and maybe even further than that someday. And there's a limit to how far copper can go. And so you could see we're 100% copper now. The next-generation ultra will have 2 options. You could copper or copper plus CPO, copper or copper plus CPO, copper or copper plus CPO -- because I have 2 options: copper plus CPO or copper. Okay. That's 1 year from now, 2 years from now, at 1152, it's all CPU because there's a limit to how far it could take copper.
And so there's a transition. However, even when NVLink is CPO and Spectrum-X is CPO, we will still have copper for the Ethernet scale up on our racks. We will still have copper for our storage. We will -- does that make sense? Because we have 5 different racks, and so the amount of copper we will use will continue to be high, because even though scale up will go to CPO in 2, 3 years, the total consumption of copper connectors is going to continue to grow because our demand and our total capacity continue to grow with all these different other racks. Did I select the right words? Yes, perfectly.
Jim Schneider, Goldman Sachs. Thanks for taking the question. You previously talked about the spectrum of token costs and very helpful to hear the 25% of that in the high tier. How do you see the market evolving over time in terms of growth rates of the lower free tier versus the high tier in a market that's been sort of predicated by big decreases in token costs coming down over time. How do you see that trending? Does that start to slow or potentially flatten out and why?
Token cost is going to keep on coming down -- can we go to the next slide, Colette? Token cost is going to keep on coming down every single year. This is just Grace Blackwell, and then with Rubin token costs will come down again, and with Rubin Ultra token costs will come down again, okay? Meanwhile, the token smartness, the smartness per token, is going to keep on going up as well as we extend that curve to the right, okay, the X-axis. Meanwhile, we're going to increase the throughput. This is everything -- nobody cares about tokens per second. You always have to divide it by watt. And the reason for that is because your data center is only so big. If your data center is a gigawatt, you're not going to have 2. If it's 200 megawatts, you're not going to have 3. Does it make sense? And so you always have to normalize it. Otherwise, no architecture -- you can compare nothing. And Moore's Law was always divided by something, okay? So you have to take tokens per second per watt. Anybody who shows you anything else just doesn't understand anything, okay? Or they're trying to deceive you somehow, all right? So that's the reason why SemiAnalysis did it right. They did it right.
Everything was divided by watt, okay? And so we're going to keep on increasing throughput. So whatever -- this is the price of a token, whatever the price -- whatever that ASP is, we increase its throughput. Whatever the ASP is, we increase the throughput. Does that make sense? And then here, whatever that segment is, we reduce the cost. Whatever that segment is, we reduce the cost. So this down here is essentially your segment, product segment. And that's through how many -- the volume production, and that's the cost of it. That's why these 2 curves are so important. Now you can combine those 2 curves if you like, but it makes your head blow up. But this curve is essentially the Pareto. In fact, most of the world today is simply right here. This is the Hopper world. You see that, Hopper is kind of right here. Blackwell extended it and added a couple of segments. And this is really valuable, and people love that because the ASP difference between here and here could be 5x, 10x. Makes sense? Larger model and faster, okay? And so these are really valuable. Now how do I see the curve changing, the demand curve changing? Yesterday, I used 25% here, 25% here, 25% here and 25% here. That's all I did. But a manufacturer's distribution of different product segments just kind of depends. Do you guys see what I'm saying? It kind of depends. Ferrari is kind of all high end, nothing in the free tier. And then somebody else, right? It just depends on the brand. And I think it's going to be the same here, guys.
If your business is search, you're going to be largely free tier because nobody pays for search. So if you're a search business, you're going to be largely free tier. If you're code generation, if you're code -- agentic code, you're going to be a lot here. If you're an enterprise worker, and the average salary of that person, let's pick a number, say, 50,000 or 70,000, you might be here. If your customer is that person, you want your product priced somewhere here. Does that make sense? It depends on your customer and the work that you do for them. It depends on the customer, the work you do for them and the competition. Those 3 things matter. It's just exactly like products. AI tokens are products, a new commodity, and we market it as such, and different suppliers, different brands, different target markets are going to have different shapes. I just simply chose an equal distribution yesterday. Makes sense?
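The normalization argument in this answer can be sketched numerically. The following is a minimal illustration, not NVIDIA's methodology: the two systems, their throughput and power figures, the 1 MW budget and the function names are all hypothetical, chosen only to show how raw tokens per second can rank systems differently from power-normalized throughput once the site's power budget is fixed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """A hypothetical inference system; all figures are made up for illustration."""
    name: str
    tokens_per_second: float  # throughput of one unit
    watts: float              # power drawn by one unit


def tokens_per_second_per_watt(system: System) -> float:
    """Throughput normalized by power."""
    return system.tokens_per_second / system.watts


def site_throughput(system: System, site_power_watts: float) -> float:
    """Total tokens/s a power-limited site delivers with this system.

    Assumes the site is filled with as many whole units as the budget allows.
    """
    units = int(site_power_watts // system.watts)
    return units * system.tokens_per_second


# Hypothetical systems: B is faster per unit but less efficient per watt.
system_a = System("A", tokens_per_second=1_000.0, watts=500.0)
system_b = System("B", tokens_per_second=1_500.0, watts=1_500.0)

site_power = 1_000_000.0  # hypothetical 1 MW power budget

for s in (system_a, system_b):
    print(s.name, tokens_per_second_per_watt(s), site_throughput(s, site_power))
```

In this made-up case, system B wins on tokens per second per unit, yet system A delivers roughly twice the total throughput inside the same power envelope, which is the comparison the speaker says matters.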
Yes, just which segment do you see is growing faster in the future?
They're all going to grow really fast at the moment. It just -- I don't think at the moment, it just doesn't matter. They're all going to grow so fast. They're all growing exponentially at the moment, every 1 of them. We're at the beginning, right? The growth rate is divided by a very small number.
Mark Lipacis, Evercore ISI. Thanks a lot for joining the Q&A. I always love the insights, Jensen. Our field work is telling us that AI engineers are getting excited about state space models because they address memory requirements. And in your keynote, you showed [indiscernible] 3 benchmarking as 1 of the top models, and I believe that's a hybrid mixture of experts, state space model. And I'm wondering ...
Impressive. I was trying to ...
Thank you, Jensen. The new AI workloads have led to the adoption of different AI models.
That was my Darth Vader imitation. Impressive, young Jedi.
So the question is, is agentic AI creating a need for a new AI model? Is that what you're doing with Nemotron and the hybrid? What does state space get you for Nemotron 3 that pure mixture of experts didn't? And what are the implications on the competitive environment for NVIDIA if there's this transition to a new kind of AI model?
We run all AI models, whether it's full transformer, discrete tokens, continuous diffusion, state space, hybrid; our architecture's beauty is that it does it all. For example, Grok doesn't do diffusion models. But we can do everything. Does that make sense? And I'm picking on Grok not because I'm picking on Grok; it belongs to me now, so I can say these things. But every architecture has its place. The reason why NVIDIA is so versatile and the reason why it's used so freely everywhere is because irrespective of what innovation your research scientists come up with tomorrow, I promise you it's going to run great on CUDA. I just promise you that. And the reason for that is because we know we have all of the necessary computing elements to do all of it, okay? And so Nemotron 3 was designed so that you can deal with extremely long context. And in time, the AI models -- you're going to have conversations with your AI hopefully for as long as you shall live.
And so the question is how do you deal with context, how do you deal with the relevant conversational memory? On the 1 hand, if you memorize everything, and we talk about something over time, which version of that memory do you pull back? When you have too much memory, over time, it could become garbled. And maybe a reset is helpful. These are research areas; long-memory areas are really research areas. But the hybrid architecture, I think, is going to be a very major thing because it allows you to deal with extremely long context and not have to suffer the quadratic explosion in computation. And that's the reason why we invented it and we put it out in open source, and we'd love for everybody to use it. And so it's intended to advance AI, not to compete with anybody. We don't need to. We just want to advance AI.
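The "quadratic explosion" mentioned here refers to full self-attention, where every token interacts with every other token in the context. As a rough sketch under simplifying assumptions (operation counts only, with constants and hardware effects ignored), the scaling difference against a linear-time recurrence looks like the following. The helper names and the `state_size` constant are hypothetical, and the recurrence is a generic cost model rather than a description of how Nemotron is actually built.

```python
def attention_pairwise_ops(context_length: int) -> int:
    """Token-pair interactions in full self-attention: grows as n squared."""
    return context_length * context_length


def recurrent_scan_ops(context_length: int, state_size: int = 16) -> int:
    """State updates in a linear-time recurrence: one fixed-size update per token.

    state_size is an illustrative constant, not a real model parameter.
    """
    return context_length * state_size


for n in (1_000, 10_000, 100_000):
    ratio = attention_pairwise_ops(n) / recurrent_scan_ops(n)
    print(f"n={n:>7}: attention/recurrence op ratio = {ratio:,.1f}")
```

Doubling the context quadruples the attention count but only doubles the recurrence count, so the gap widens as conversations get longer.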
Thank you, Jensen. So I'm trying to understand how concentrated your downstream, like the AI market, is and is going to be, and so you have this chart showing 60% is hyperscalers. But I'm kind of thinking the other 40%, the majority of that is Tier 2 cloud, and a lot of them are actually reselling or renting their capacity to hyperscalers or to the frontier labs. So if you take hyperscalers plus frontier labs, it might be like 80% of people actually using the infrastructure that is being deployed. So that's an element of concentration, and then these models like Anthropic models, the OpenAI models, et cetera, seem to be like a very small handful that are really at the frontier. And so do you think that's the right description of the situation today? How do you see that evolving? And maybe what does that mean in terms of right to make money in the value chain and development and like further acceleration of AI?
Okay. So I would slice it into 3 dimensions, okay? And as you were talking, I simplified it as much as I can into a cube, into 3 dimensions. The first dimension is what is the end model being run? And I said earlier, OpenAI is the largest. The second largest by category is basically all the open models. In aggregate, it is definitely, solidly number 2. And then number 3 would be Anthropic, and then so on and so forth, okay? And the tail is actually fairly long, okay? And so if you look at the world of model consumption, even just language, that's the way to think about it. And we run all of them. We're in all of them. That's 1 dimension. In that sub-dimension of models, you have to decide to add also physical AI models, which is robotics, like all the robots you saw; they're not running -- they're running vision language action models. And those models are different than just language models. And for example, the control of motors is continuous. It's not like a character. It's not like words, it's continuous.
And so physics is continuous. Biology has base geometry because things, chemicals, have a base geometry, okay? And so there's a lot of different types of models. But the point being that you have to first think about the different types of models being run, and that's helpful to how you think about the right to -- right to business. The second dimension is are they -- depending on the way that the companies are structured and their intentions or interests, they are either companies that want to build their own chips, and we have to compete; companies that want to host NVIDIA customers in their cloud, and obviously, CUDA only runs on NVIDIA; and then are they companies like, for example, NCPs, where they need us -- they can't just buy chips, they really have to buy systems. And so they're really infrastructure customers. Or are they companies that want to build on-prem? Therefore, my distribution channel goes through Dell and HP and Lenovo because it has to integrate a whole bunch of other enterprise computing components, and Dell and HP, they don't build their own chips. Or are they at the edge, and maybe they're radio networks, maybe they're robotic systems or self-driving cars or satellites and so on and so forth. Does that make sense?
Now you've got to decide where the computing is being done, okay? And so there are kind of several dimensions, I guess, you could think about. And when you're done subdividing all of that, you come back to the chart that I showed you, which is 60%, 40%. Within that 60%, 40%, the 40% of it, basically, they need computing. It doesn't matter what models they run; it could be open models, could be Anthropic models. The fact that NVIDIA supports confidential computing makes it possible for OpenAI to run on the right side at all. We make it possible for Anthropic to run on the right side at all because we have confidential computing. That side, they want entire platforms, they want confidential computing.
They want computers at different parts of the world, not just in the cloud. Even in the cloud, we compete with some part, but we also bring customers to the other part. And so some part of that chart of 60%, we have to compete. And our job is just to deliver that chart better than anybody else in the world, and we're doing very, very well, and we're actually increasing our position day in and day out. And then the other part, we bring customers to them. They're just grateful.
Makes sense? So I took all of that dimensionality and I compressed it into basically 2 slices of the pie. And so that compression, I think, if you test against -- does NVIDIA compete with them on chips? Okay, there you go. That's interesting. And then you got to figure out where are we in our position and what's our opportunity and so on and so forth. I don't think OCI will design their own chips. I don't think it's sensible for them to do it. Obviously, CoreWeave's not going to design their own chips. And so there, we -- so where do we compete and where do we bring the cloud service provider customers? And their cloud revenues, a lot of them -- a big part of it, obviously, nearly 100% of that is because of NVIDIA, right, with OpenAI.
We'll take our last question.
It's Timm Schulze-Melander from Rothschild & Co Redburn. Maybe just a question around how you run the company, Jensen. And looking ahead, this 12-monthly flywheel is part of your competitive advantage. But when I look at headcount, actually, it seems to be growing very slowly, relatively slowly. And yet the undertaking that you are going for is growing much more rapidly than that. How do you manage that or prepare for that going forward? And how do you manage maybe the risk that, that could pose to your business?
Yes. As you know, I have 60 people on my direct team. And the reason why we need 60 people is because the company's architecture was designed to deliver on this architecture, on the products. The organization, the architecture of a company should reflect the products they build. Every company should not look -- have the same business org. And I look across and I said, "Oh, look, they have a business unit here. They have a business unit there. They have a business unit there, and yet they want to build what we want to build." What you build as a company, for example, the way -- not because I've seen it, I've read about it. The way you build a Ferrari and the way you build a Ford is very different. In 1 case, you move the car; in the other case, you move the people, okay? And so the car stays stationary. And so it depends on the results of what you want to create; the architecture should reflect it. If you look across my management team, every aspect of the technology necessary to build Vera Rubin's entire factory is right there, 100%. Everybody is represented. All of the expertise sitting at the table, making a decision together. And the second thing is we had the discipline to develop the entire software stack. You can't build what we build on a yearly basis if you can't bring it up. Are you guys following me? It's very logical. How do you test it if you can't bring it up? And if you're cobbling up new technology from everybody else, how do you bring it up once a year? It's just not even practical. It's not possible. So we align all of our chips to the platforms -- all 7 chips, they only have 1 tape-out schedule. I don't cobble up everybody's tape-out schedule and figure out when the system comes. The system comes when the system needs to come, and everybody aligns to it. And the software stack, we completely own every piece.
The storage, that's the reason why we developed it; networking, of course; even the factory operating system we call Dynamo. We created everything so that we could deliver every single benchmark, test everything to the limit, test for reliability, test for -- and the reason why NVIDIA built Nemotron is so that we could do pretraining, post-training, and now we can do inference. We own all of the software so that we can bring up all of the systems on an annual basis, which basically says you're bringing up all the time. If you don't own everything, you have no shot, 0% chance. People are talking about their new GPU, but where's their scale-up fabric coming from? And how is that going to work? And that's just -- I just gave you 2 examples. That whole agentic system that we were talking about earlier, that's the future computer. And so that's really what we -- the company's organization, the company's mission, the company's capabilities are all aligned to me delivering the promise that I just delivered to the marketplace. And that's why we're able to keep doing it. A PowerPoint slide is not going to deliver that system. And a PowerPoint slide with 2 bar charts is not going to convince somebody to give you $50 billion. It doesn't make any sense. And to engineer it all into existence inside the data center, by the time that you bring it up, we're already 2 clicks down the road. So this is the pace that we put the whole industry on, and it is, frankly, extremely, extremely hard. And we could do it, but that's because of all the things that I just described.
You also know that every one of our systems is CUDA compatible. So on day 1, I've got yesterday's software that runs perfectly on this one. I own all the scale-up switch, I own all the scale-out switch. I own all the software, do I not? So on day 1, I take yesterday's software and put it on the new system. If it doesn't work, what's the point? Then once we get everything brought up, because we own all the software stack, then we can take it to the limit. And so having CUDA compatibility, we have this thing called DOCA compatibility. We own all the compilers, we own all the software stack; really, really important.
You can't outsource that to other people. If somebody else is building it on your behalf, then how do you bring up a system? They're not going to bring up your system for you. They're not going to qualify it for you. And so, that's it. Can we take 1 more question? Is that okay? Can you guys tolerate one more question? I'm enjoying it so much. Let me just -- somebody is going to ask me a question where I had to choose the precise word: hair, or did he say hair or a hare? That's materially different.
Thank you for extending the session and squeezing me in. Jensen, I just want to clarify 1 thing.
Here it comes. Oh dear, I changed my mind. I changed my mind. Everybody have a good GTC.
Quick clarification. Does the $1 trillion plus include Rubin Ultra or not? And my question is...
No, I got to stop you right there. Thank you. No, no. Absolutely not. And yes, absolutely not.
Okay. My question is we talked a lot about inferencing at this event. I just -- I was hoping that you could spend a couple of minutes on training in terms of how do you see the compute intensity growing? What will drive in your view over the next few years? Is it still the larger and larger models? Or is there something else on the horizon that you see? And I guess if you take a 3- to 5-year view, what's your view on training versus inferencing mix in terms of compute demand.
Training went from pretraining to post-training. Pretraining is basically memorization, memorization and generalization. The more you memorize and generalize, the better foundation you have. Once you have that foundation -- that's why it's called pretraining. It's kind of like AI kindergarten, okay? It's more than kindergarten, but AI high school. And so now you have the pretraining. You have the basic vocabulary and grammar and a lot of hidden reasoning capability, so that when I teach you new skills, you'll even understand it. So now when I tell you to go solve a math problem or write code or try to write code, you actually understood what I meant. If you don't even understand what I meant, how can you possibly even attempt doing it? And so pretraining does that. Post-training teaches all kinds of skills, okay? And reinforcement learning: reinforcement learning with executable grounding, reinforcement learning with verifiable feedback, a whole bunch of techniques for batch-oriented reinforcement learning, tool use. I mean, the list goes on and on, okay? Structured-based APIs, unstructured-based tool use. I mean, there's a whole lot of domains. And that part, computing intensity, I'm going to guess, is probably a million times more than pretraining. I'm probably off by a factor of 1.2, but it's a lot. And the reason for that is because there's a lot of skills to go learn, and with all these skills, the rollout is really, really long. And so the models have to get larger and larger. When you get good at these, you take all of that synthetic data, and some of it you're going to push back to pretraining next time. And so yesterday's pretraining started all from Internet data.
Today's pretraining is mostly Internet data. In a couple of generations, pretraining will be mostly synthetic data. Meanwhile, you're adding multimodality to it. Meanwhile, you're adding motion to it, long-rollout physical actions to it. And the reason for that is because there's a lot of common sense that's cognitively logic related, such that if you were able to interact in the physical world, you could deal with that concept a lot easier even in the abstract world, okay? Because you actually have grounded experience in the physical world. And so notice the amount of computation that I just described: we're at 1 million -- 1 billion times the future amount of computing necessary for training. And then after that, continuous learning. So almost everybody's model will be lastly trained, fine-tuned so that it could also be memorized and generalized per person. And so in the future, basically, where inference starts and ends and where training starts and ends will become blurrier and blurrier. Just kind of, when are you learning and when are you applying your wisdom? Well, in most people's cases it's continuous now. And so I think that kind of gives you the 3 phases of it.
And with respect to inference versus training, let me tell you my hope. My hope is that 99% of the world's compute goes towards inference. And the reason for that is because inference is where we translate tokens generated to economics. Nobody pays you for learning. Nobody pays for training; you pay for training. I want the world to be able to use these tokens for valuable outcomes, impactful outcomes: for health care, for manufacturing, for financial services, for engineering, you name it, isn't that right? And so our hope is that 99% -- and if our dreams come true, 100% of the future tokens are going towards economic benefits while the AI models are learning.
And so there's a really good reason why NVIDIA went all in on inference last year. And the reason for that is because we see this future where inference and training and pretraining and learning and all that is just 1 big continuum. Go back and read the stories people wrote 2 years ago: NVIDIA is really good at training, inference is easy, any company could do that. Do you guys remember that? Inference is super hard. Look at this chart. It's super hard, and it's going to get way harder. Inference is thinking, it's working, it's doing things. How could that be easy? I thought my life was easy pre-high school, not post-high school. Pre-high school, super easy. After that, it was super hard. And so I think people just got it all completely backwards. And they wanted to make up stories that rationalized their opportunity, which is fine. But you have to reason about it from first principles. And I take a long time answering questions for you guys instead of giving a short, highly curated, super well-selected answer with precisely adjusted verbs and nouns. And the reason for that is because I want you guys to learn how to reason through these things. So when you see it yourself, you go, no, that's not making sense, or that makes sense -- because you're analysts, you need to be able to understand these things.
Okay. All right, guys. Thank you very much. Thanks for coming to GTC.
NVIDIA — Shareholder/Analyst Call - NVIDIA Corporation
Overview
At GTC, NVIDIA presents a strategic perspective on agentic systems, Open Claw and the token economy. The focus is on long-term growth targets, infrastructure build-out and a strong role for the NVIDIA platform; no conventional quarterly figures are given.
Key Figures
- The transcript gives no specific revenue, profit or margin figures, nor EPS changes versus the prior year or prior quarter.
- The headline figure is the stated visibility of more than USD 1 trillion for Blackwell plus Rubin through the end of 2027 (reaffirmed, with continued growth).
Strategic Direction
- Third AI inflection point: agentic systems that can carry out tasks autonomously; implementation via the Open Claw / open-source fork Nemo Claw approach; focus on a token-based business model and on the AI factory as an integrated whole.
- Emphasis on positioning as a platform provider: CUDA ecosystem, open models, close collaboration with hyperscalers, while also addressing regional clouds, enterprise on-prem and ODMs.
- Product and architecture strategy: Vera Rubin as the central architecture, Grok as a complement for last-mile throughput; NVLink/CPO options; integration of memory (HBM, LPDDR5, SRAM) and CPU components; Open Claw as the "operating system" of the architecture.
Outlook & Guidance
The outlook points to strong demand and continued infrastructure build-out; demand is expected to keep developing and to grow beyond what has been discussed so far. Financial guidance takes the form of capital return: it is announced that 50% of free cash flow could be used for share buybacks and dividends; further opportunities remain open, depending on commitments in the first half of the year.
Analyst Questions
- Question: Upside in hyperscaler revenues and whether the "token ecosystem" can significantly drive revenues; how quickly would such potential be realized? Answer: Huang stresses that the IT industry is being transformed, with growing token demand and open models; the business model is shifting from licenses to token generation, which could lead to higher revenues provided token volume and throughput rise.
- Question: Rubin vs. Grok: timeline, scope of delivery and impact on margins; Rubin ships before Grok; how does the cost-benefit profile change? Answer: Rubin remains in the lead; Grok complements it to raise last-mile throughput; the architecture aims to expand token segments and differentiate prices.
- Question: Margin sustainability: how stable do margins remain as the structure becomes increasingly token-based? Answer: The presentation emphasizes token-per-second value as well as segmentation; over the longer term, margin performance is expected to be supported by rising throughput and token volume.
NVIDIA — NVIDIA GTC AI Conference 2026
1. Management Discussion
Welcome to the stage, NVIDIA Founder and CEO, Jensen Huang.
Welcome to GTC. I just want to remind you, this is a tech conference. All these people are lining up so early in the morning, all of you in here, it's great to see you. GTC. GTC, we're going to talk about technology. We're going to talk about platforms. NVIDIA has 3 platforms. You think that we mostly talk about one of them. It's related to CUDA-X. Our systems is another platform. And now we have a new platform called AI factories. We're going to talk about all of them. And most importantly, we're going to talk about ecosystems.
But before I start, let me thank our pregame show hosts. I thought they did a great job. Sara Guo of Conviction; Alfred Lin, Sequoia Capital, NVIDIA's first venture capitalist. Gavin Baker, NVIDIA's first major institutional investor. These 3 people are deep in technology, deep in what's going on. And of course, they have just a really broad reach of technology ecosystem. And then, of course, all of the VIPs that I hand selected to join us today, All-Star team. I want to thank all of you for that.
I also want to thank all the companies that are here. NVIDIA, as you know, is a platform company. We have technology, we have our platforms. We have rich ecosystem. And today, there are probably 100% of the $100 trillion of industry here, 450 companies sponsored this event. I want to thank you, 1,000 technical sessions, 2,000 speakers. This is -- this conference is going to cover every single layer of the 5-layer cake of artificial intelligence from land, power and shell, the infrastructure to chips to the platforms, the models and of course, the most important and ultimately, what's going to take -- get this industry taken off is all of the applications.
But it all began here. This is the 20th anniversary of CUDA. We've been working on CUDA for 20 years. For 20 years, we've been dedicated to this architecture, this revolutionary invention, SIMT, single instruction, multi-threaded: writing scalar code that could be spun off into a multi-threaded application, much, much easier to program than SIMD. We recently added tiles so that we could help people program Tensor Cores and the structures of mathematics that are so foundational to artificial intelligence today. Thousands of tools and compilers and frameworks and libraries; in open source, there's a couple of hundred thousand public projects. CUDA literally is integrated into every single ecosystem.
This chart basically describes 100% of NVIDIA's strategies. You've been watching me talk about this slide from the very beginning. And ultimately, the single hardest thing to achieve is the thing on the bottom, installed base. It has taken us 20 years to now have built up hundreds of millions of GPUs and computing systems around the world that run CUDA. We are in every cloud, we're in every computer company. We serve just about every single industry. The installed base of CUDA is the reason why the flywheel is accelerating. The installed base is what attracts developers, who then create new algorithms that achieve a breakthrough. For example, deep learning. There are so many others. Those breakthroughs lead to entirely new markets, which build new ecosystems around them with other companies that join, which creates a larger installed base.
This flywheel is now accelerating. The number of downloads of NVIDIA libraries is incredibly accelerating. It's at a very large scale and growing faster than ever. This flywheel is what makes this computing platform able to sustain so many applications, so many new breakthroughs. But most importantly, it also enables these infrastructures to have an extraordinarily long useful life. And the reason for that is very obvious. There are so many applications that you can run on NVIDIA CUDA. We support every single phase of the AI life cycle. We address every single data processing platform. We accelerate scientific principle solvers of all different kinds. And so the application reach is so great that once you install NVIDIA GPUs, the useful life of it is incredibly high. It is also one of the reasons why, for Ampere, which we shipped some 6 years ago, the pricing in the cloud is going up.
And so all of that is made possible fundamentally because the installed base is high, the flywheel is high, the developer reach is great. And when all of that happens, and we continuously update our software, the computing cost declines. The combination of accelerated computing speeding up applications tremendously. Meanwhile, as we continue to nurture and continue to update software over its life, not only do you get the first-time pop, you get the continuous cost reduction of accelerated computing over time. And we're willing to nurture, willing to support every single one of these GPUs in the world because they're all architecturally compatible. We're willing to do so because the installed base is so large, if we release a new optimization, it benefits millions. This applies to everybody in the world.
This combination of dynamics is what makes the NVIDIA architecture expand its reach, accelerating its growth, at the same time, driving down computing cost, which ultimately encourages new growth. So CUDA is at the center of it. But our journey to CUDA actually started 25 years ago. GeForce, I know how many of you grew up with GeForce. GeForce is NVIDIA's greatest marketing campaign. We attract future customers starting long before you could afford to pay for it yourself. Your parents paid. Your parents paid for you to be NVIDIA customers. And every single year, they paid up year after year after year until someday you became an amazing computer scientist and became a proper customer, a proper developer. But this is the house that GeForce made.
25 years ago, we started our journey, which led to CUDA. 25 years ago, we invented the programmable shader, a perfectly unobvious invention to make an accelerator programmable, the world's first programmable accelerator, the pixel shader. That led us to explore further and further and, 5 years later, to the invention of CUDA. One of the biggest investments that we made, and we couldn't afford it at the time and it consumed the vast majority of our company's profits, was to take CUDA on the backs of GeForce to every single computer. We dedicated ourselves to create this platform because we felt so much -- we felt so strongly about its potential. But ultimately, because of the company's dedication to it despite the hardships in the beginning, believing in it every single day for 13 generations or 20 years, we now have CUDA installed everywhere. The pixel shader led to, of course, the revolution of GeForce.
And then 10 years ago, we introduced -- about 10 years ago, what is it, 8 years ago, we introduced RTX, a complete redesign of our architecture for the modern era of computer graphics. GeForce brought CUDA to the world. GeForce, therefore, enabled Alex Krizhevsky and Ilya Sutskever and Geoff Hinton, Andrew Ng and so many others to discover that the GPU could be their friend in accelerating deep learning. It started the big bang of AI. 10 years ago, we decided that we would fuse programmable shading and introduced 2 new ideas, ray tracing, hardware ray tracing, which is incredibly hard to do and a new idea at the time, imagine about 10 years ago, we thought that AI would revolutionize computer graphics. Just as GeForce brought AI to the world, AI is now going to go back and revolutionize how computer graphics is done all together.
Well, today, I'm going to show you something of the future. This is our next generation of graphics technology. We call it neural rendering, the fusion of 3D graphics and artificial intelligence. This is DLSS 5. Take a look at it.
[Presentation]
Is that incredible? Computer graphics comes to life. Now what did we do? We fused controllable 3D graphics, the ground truth of virtual worlds, the structured data. Remember this word, the structured data of virtual worlds, of generated worlds. We combine 3D graphics, structured data with generative AI, probabilistic computing. One of them is completely predictive, the other one, probabilistic, yet highly realistic. We combine these 2 ideas, controlled through structured data, controlled perfectly and yet generating at the same time. And as a result, the content is beautiful, amazing as well as controllable. This concept of fusing structured information and generative AI will repeat itself in one industry after another industry after another industry. Structured data is the foundation of trustworthy AI.
Well, this is going to scare you a little bit. I'm going to flip the slide and don't gasp. So we're going to go through the schematic for the rest of the time. This is my best slide. Every time I asked the team, what's my best slide? Repeatedly, this was it. They say, don't do it, Jensen, don't do it. I said, no. These seats are free for some of you. So this is your price of admission. So this is structured data. You've heard of it, SQL, Spark, pandas, Velox, some of these really, really important, very large platforms, Snowflake, Databricks, EMR, Amazon EMR, Azure Fabric, Google Cloud BigQuery. All of these platforms are processing data frames. These data frames are giant spreadsheets, and they hold all of life's information. This is the structured data, the ground truth of business. This is the ground truth of enterprise computing.
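As a minimal sketch of what "structured data" means in practice, the snippet below builds a tiny table and runs a SQL aggregation over it with Python's built-in `sqlite3` module. The table name, column names and values are made up for illustration; the platforms named above run the same kind of query over far larger data frames, and nothing here uses or represents NVIDIA's libraries.

```python
import sqlite3


def revenue_by_region(rows):
    """Aggregate (region, amount) rows with a SQL GROUP BY query."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM orders "
            "GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data, for illustration only.
sample_orders = [("EMEA", 120.0), ("APAC", 80.0), ("EMEA", 30.0), ("AMER", 200.0)]
totals = revenue_by_region(sample_orders)
```

Every row has the same named columns, which is what lets the query engine group and sum without having to interpret the content.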
Well, now we're going to have AI use structured data. And we better accelerate the living daylights out of it. It used to be okay, and we would -- of course, we would accelerate structured data so that we could do more, we could do it more cheaply, we could do it more frequently per day and keep the company running at a much more synchronized way. However, in the future, what's going to happen is these data structures are going to be used by AI, and AI is going to be much, much faster than us. Future agents are going to use structured databases as well. And then, of course, the unstructured database, the generative database. This database represents the vast majority of the world, vector databases, unstructured data, PDFs, videos, speeches, all of the world's information, about 90% of what's generated every single year is unstructured data.
Until now, this data has been completely useless to the world. We read it. We put it into our file system, and that's it. Unfortunately, we can't query it. We can't search for it. It's hard to do that. And the reason for that is because there's no easy indexing of unstructured data. You have to understand its meaning, its purpose. And so now we have AI do that. Just as AI was able to solve multi-modality perception you can -- and understanding, you can use that same technology, multimodality perception and understanding to go read a PDF to understand its meaning. And from that meaning, embed it into a larger structure that we can search into, we can query into.
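The idea of embedding unstructured text and then searching it can be sketched in a few lines of plain Python. This is a toy under stated assumptions: the `embed` function is a bag-of-words counter standing in for a learned embedding model, the documents are invented, and the brute-force scan stands in for a real vector index. It does not show how cuVS or any production vector store is implemented.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=1):
    """Return the top_k documents most similar to the query (brute force)."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


# Hypothetical documents, for illustration only.
documents = [
    "quarterly revenue report for the data center segment",
    "safety manual for warehouse robots",
    "meeting notes about gaming graphics roadmap",
]
best = search("robots safety", documents)
```

A learned embedding would place texts with similar meaning close together even when they share no words, which word counts cannot do; the search step stays the same.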
NVIDIA created 2 foundational libraries, just like we created RTX for 3D graphics. We created cuDF for data frames, structured data. We created cuVS for vector stores, semantic data, unstructured data, AI data. These 2 platforms are going to be 2 of the most important platforms in the future. Super excited to see its adoption throughout the network, this complicated network of the world's data processing systems. And the reason for that is because data processing has been around a long time. And therefore, so many different companies and platforms and services, it has taken us a long time to integrate deeply into this ecosystem. I'm super proud of the work that we're doing here.
And then today, we're announcing several of them. IBM, the inventor of SQL, one of the most important domain-specific languages of all time, is accelerating watsonx.data with cuDF. Let's take a look at it.
[Presentation]
NVIDIA accelerates data processing in the cloud. We also accelerate data processing on-prem. As you know, Dell is the world-leading computer systems maker, and they also are one of the world's leading storage providers. And they worked with us to create the Dell AI data platform that integrates cuDF and cuVS to create an accelerated data platform, well, for the era of AI. And this is an example of what they did with NTT DATA, huge speed up. This is cloud -- Google Cloud. And Google Cloud, as you know, we've been working with Google Cloud for a very long time. We accelerate Google's Vertex AI. We now accelerate BigQuery, really important framework and really important platform. And this is an example of our work together with Snapchat, where we reduced their cost of computing by nearly 80%.
When you accelerate data processing, when you accelerate computing, you get the benefit of speed, you get the benefit of scale. But most importantly, you also get the benefit of cost. And so all of those come together as one. It was originally called Moore's Law. Moore's Law was about getting performance doubling every couple of years. It's another way of saying, so long as the price remains about the same and most computers remained about the same, you're also getting twice the performance every year or you're reducing the cost of computing every single year. Well, Moore's Law has run out of steam. We need a new approach. Accelerated computing allows us to take these giant leaps forward.
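The link between doubling performance and falling cost can be made concrete with a small calculation. The sketch below assumes a fixed price and a doubling period of two years ("every couple of years"); both are illustrative parameters, not measured figures.

```python
def relative_performance(years, doubling_period_years=2.0):
    """Performance relative to year 0 if it doubles every doubling period."""
    return 2.0 ** (years / doubling_period_years)


def relative_cost_per_unit(years, doubling_period_years=2.0):
    """Cost per unit of performance relative to year 0, at constant price."""
    return 1.0 / relative_performance(years, doubling_period_years)


# Illustrative horizon: ten years at one doubling every two years.
perf_after_10_years = relative_performance(10)
cost_after_10_years = relative_cost_per_unit(10)
```

Under these assumptions, five doublings in ten years give 32 times the performance at the same price, or about 3% of the original cost per unit of work.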
And as you will see later, because we continue to optimize the algorithms, and NVIDIA is an algorithm company. As we continue to optimize the algorithms and because our reach is so large and our installed base is so large, we can reduce the computing cost, increasing the scale, increasing the speed for everybody continuously. This is Google Cloud. You can see this pattern I just mentioned. I just wanted to show you 3 versions of it. NVIDIA built the accelerated computing platform, has a bunch of libraries on top. I gave you 3 examples. RTX is one of them, cuDF is another, cuVS, and we'll show you a few more. These libraries sit on top of our platform. But ultimately, we integrate into the world's cloud services, into the world's OEMs and together and other platforms that I'll show you, together, we're able to reach the world.
This pattern, NVIDIA, Google Cloud, Snapchat will repeat over and over again and it kind of looks like this. And so this is one example, NVIDIA with Google Cloud. We accelerate Vertex AI. We accelerate BigQuery. We accelerate -- I'm super proud of the work that we've done with JAX and XLA. We are incredible on PyTorch. We're the only accelerator in the world that's incredible on PyTorch and incredible on JAX and XLA. And the customers that we support, the Baseten, the CrowdStrike, PUMA, Salesforce, they're not our customers, but their customers, developers of ours that we've integrated the NVIDIA technologies into that we can then land on the cloud.
Our relationship with cloud service providers is essentially us bringing customers to them. We integrate our libraries, we accelerate workloads, and we land those customers in the cloud. And so as you could see, most of our cloud service providers love working with us, and they're always asking us to land the next customer on their cloud. And I just want to let you know, there are a lot of customers. We're going to accelerate everybody. And so there will be lots and lots of customers and we'll be able to land in your cloud. Just be patient with us.
And so this is Google Cloud. This is AWS. We've been working with AWS a long time. And one of the areas -- one of the things I'm super excited about this year is we're going to bring OpenAI to AWS. And so it's going to drive enormous consumption of cloud computing at AWS. It's going to expand the reach and expand the compute of OpenAI. And as you know, they are completely compute constrained. And so AWS, we accelerate EMR, we accelerate SageMaker, we accelerate Bedrock. NVIDIA is integrated really deeply into AWS. They were our first cloud partner. Microsoft Azure. The first NVIDIA A100 supercomputer we built was for NVIDIA. The first one we installed was at Azure. And that led to the big successful partnership with OpenAI. But we've been working with Azure for quite a long time. We accelerate Azure Cloud. Now it's their AI Foundry that we partner deeply with. We accelerate Bing Search. We work with them on Azure regions.
This is one of the areas that is incredibly important as we continue to expand AI throughout the world. One of the capabilities that we offer is confidential computing. In confidential computing, you want to make sure that even the operator cannot see your data. Even the operator cannot touch or see your models. NVIDIA's GPUs are the first ones in the world to do that. They're now able to support confidential computing and protected deployment of these very valuable OpenAI models and Anthropic models throughout clouds and different regions, all because of our confidential computing. Confidential computing is super important.
And here's an example where we have different customers that we work with. Synopsys, a great partner of ours, who are accelerating all of their EDA and CAE workflows. And then we landed at Microsoft Azure. We were Oracle's first AI customer. Most people would have thought we were their first supplier. We were their first supplier also, but we were their first AI customer. I'm quite proud of the fact that I explained AI clouds to Oracle for the first time, and we were their first customer. Since then, they've really taken off. We've landed a whole bunch of our partners there, Cohere and Fireworks and of course, very famously, OpenAI. A great partnership with CoreWeave. They're the world's first AI native cloud, a company that was built with only one singular purpose to provision, to host GPUs as the era of accelerated computing showed up and to host for AI clouds. They've got some fantastic customers, and they're growing incredibly.
One of the platforms that I'm quite excited about is Palantir and Dell. The 3 of our companies have made it possible to stand up a brand-new type of AI platform, the Palantir Ontology platform and AI platform. And we could stand up these platforms in any country, in any air-gapped region, completely on-prem, completely on site, completely in the field. AI could be deployed literally everywhere. Without our confidential computing capability, without our ability to build an end-to-end system as well as offer the entire accelerated computing and AI stack from data processing, whether it's vectors or structures, all the way to AI, it wouldn't have been possible.
I wanted to show you these examples. This is our special working relationship with the world's cloud service providers. And many -- well, all of them are here, and I get the benefit of seeing them during booth tour, and it's just so incredibly exciting. I just want to thank all of you for the hard work. What NVIDIA has done is this. And you're going to see this theme over and over again. NVIDIA is vertically integrated, the world's first vertically integrated but horizontally open company. And the reason that's necessary is very simple. Accelerated computing is not a chip problem. Accelerated computing is not a systems problem. Accelerated computing has a missing word. We just never say it anymore. Application acceleration. If I could make a computer run everything faster, that's called a CPU. But that's run out of steam.
The only way for us to accelerate applications going forward and continue to bring tremendous speed up, tremendous cost reduction is through application or domain-specific acceleration. I dropped that phrase in the front, and therefore, it just became accelerated computing. And that is the reason why NVIDIA has to be library after library, domain after domain, vertical after vertical. We are a vertically integrated computing company. There is no other way. We have to understand the applications. We have to understand the domain. We have to understand fundamentally the algorithms, and we have to figure out how to deploy the algorithm in whatever scenario it wants to be deployed, whether it's a data center, cloud, on-prem at the edge on a robotic system, all of those computing systems are different.
And finally, the systems and chips. We are vertically integrated. What makes it incredibly powerful and the reason why you saw all the slides is because NVIDIA is horizontally open. We'll work and integrate NVIDIA's technology into whatever platform you would like us to integrate into. We offer you the software, we offer you libraries. We integrate with your technology so that we can bring accelerated computing to everybody in the world. Well, this GTC is really a great demonstration of that. Most of the time, most of the time, you'll see me talk about these verticals, and I'll use some examples. But in every single case, whether it's automotive -- by the way, financial services, the largest percentage of attendees at this GTC is from the financial services industry. I know, I'm hoping it's developers, not traders.
Guys, here's one thing I wanted to say. The audience represents NVIDIA's ecosystem, upstream of our supply chain and downstream of our supply chain. And we work -- we think about our supply chain upstream and downstream. And it's just so exciting that our entire upstream supply chain this last year, irrespective of whether you're a 50-year-old company, we have 70-year-old companies. We have a 150-year-old company who are now part of NVIDIA's supply chain and partnering with us, either upstream or downstream. And last year, you had your record year. Did you not? Congratulations. We're on to something here. This is the beginning of something very, very big.
And so if you look at accelerated computing, we've now set the computing platform. But in order for us to activate those computing platforms, we need to have domain-specific libraries that solve very important problems in each one of the verticals that we address. You see us addressing every single one of these. Autonomous vehicles, our reach, our breadth, our impact, incredible. We have a track on that. Financial services, I just mentioned. Algorithmic trading is going from classical machine learning with human feature engineering, which the quants did, to now supercomputers studying massive amounts of data, discovering insight and discovering patterns by themselves. And so this is going through its deep learning and its transformer moment. Health care is going through its ChatGPT moment, with some really exciting work that we're doing there. We have a great keynote track here with Kimberly Powell, a great keynote track for health care. We're talking about AI physics or AI biology for drug discovery, AI agents for customer service and support of diagnosis. And of course, physical AI, robotic systems.
All these different vectors of AI have different platforms that NVIDIA provides. Industrial, we are completely resetting and starting the largest build-out of human history. And most of the world's industries, building AI factories, building chip plants, building computer plants are represented here today. Media and entertainment, gaming, of course, real-time AI platform so that we could do translation and broadcast support, and live games and live video, an enormous amount of it will be augmented with AI. We have a platform called Holoscan. Quantum, there are 35 different companies here building with us the next generation of quantum GPU hybrid systems. Retail and CPG, using NVIDIA for supply chain, using -- creating agentic shopping systems. AI agents for customer support, a lot of work being done here, $35 trillion industry. Robotics, $50 trillion industry in manufacturing.
NVIDIA has been working in this area for a decade now, building 3 computers, the fundamental computers necessary to build robotic systems. We are integrated with, working with literally every single company that we know of building robots. We have 110 robots here at the show and then telecommunications. About as large as the world's IT industry, about $2 trillion. We see, of course, base stations everywhere. It's one of the world's infrastructures. It was the infrastructure of the last generation of computing. That infrastructure is going to get completely reinvented.
And the reason for that is very simple. That base station, which is -- it does one thing, which is base station, is going to be an AI infrastructure platform in the future. AI will run at the edge. And so lots and lots of great discussion there. And our platform there is called Aerial, our AI RAN, big partnership with Nokia, big partnership with T-Mobile and many others.
At the core of our business, everything that I just mentioned, computing platforms, but very importantly, our CUDA-X libraries. Our CUDA-X libraries are the algorithms that NVIDIA invents. We are an algorithm company. That's what makes us special. That's what makes it possible for me to be able to go into every single one of these industries, imagine the future and have the world's best computer scientists describe and solve problems, refactor it, reexpress it and turn it into a library. We have so many. I think we have -- at this show, we are announcing 100 libraries, 70 libraries, maybe 40 models. And that's just at the show. We're updating these all the time. We're updating them all the time.
The libraries are the crown jewels of our company. They are what makes it possible for that platform, the computing platform, to be activated in service of solving a problem, making impact. One of the biggest -- one of the most important libraries that we ever created, cuDNN, CUDA deep neural networks. It completely revolutionized artificial intelligence, caused the big bang of modern AI. Let me show you a short video about CUDA-X.
[Presentation]
Everything you saw was a simulation. Some of it was principled solvers, fundamental physics solvers. Some of it was AI surrogates, AI physical models and some of it was physical AI robotics models. Everything was simulated. Nothing was animated, nothing was articulated. Everything was completely simulated. That is what fundamentally NVIDIA does. It is through the connection of understanding of the algorithms with our computing platforms that we're able to open up, to unlock these opportunities. NVIDIA is a vertically integrated computing company with open horizontal integration with the world. So that's CUDA-X.
Well, just now you saw a whole bunch of companies. You saw Walmart and there's L'Oreal and incredible companies, established companies, JPMorgan and Roche. And these are companies that have defined society to today. Toyota is here. These are some of the largest companies in the world. It is also true that there's a whole bunch of companies you've never heard of. These are companies, we call them AI natives, a whole bunch of small companies. The list is gigantic. I couldn't -- this is just a little tiny bit of it. And I couldn't decide whether to show you more or show you less. And so I made it so that you couldn't see any. And nobody's feelings are hurt. However, inside this list are a bunch of brand-new companies.
There are companies like, for example, you might have heard a couple of them, OpenAI, Anthropic, but there's a whole bunch of others. There's a whole bunch of others, and they serve different verticals. Something happened in the last 2 years, particularly this last year. We've been working with the AI natives for a long time. And this last year, it just skyrocketed. I'll explain to you why it happened. This industry has skyrocketed, $150 billion of venture investment into start-ups, the largest in human history. This is also the first time that the scale of the investments went from millions of dollars, tens of millions of dollars to hundreds of millions of dollars and billions of dollars.
And the reason for that is this is the first time in history that every single one of these companies needs compute and lots and lots of it. They need tokens, lots and lots of it. They're either going to create and build and create tokens and generate tokens or they're going to integrate, add value to tokens that are available created by Anthropic and OpenAI and others. And so this industry is different in so many different ways, but the one thing that is very clear, the impact that they're making, the incredible value that they're delivering already is quite tangible. AI natives, all because we reinvented computing.
Just like during the PC revolution, a whole bunch of new companies were created. Just as during the Internet revolution, a whole bunch of companies were created. In the mobile cloud, a whole bunch of companies were created. Each one of them had their own standards, and we're talking about one of the major standards that just happened, incredibly important. And this generation, we also have our own large number of very, very special companies. We reinvented computing. It stands to reason there's going to be a whole new crop of really important companies, consequential companies for the future of the world. The Googles, the Amazons, the Metas, consequential companies that have come as a result of the last computing platform shift. We are now at the beginning of a new platform shift.
But what happened in the last couple of years? Well, we've been watching, as you know, we've been working on deep learning and working on AI, the big bang of modern AI, we were right there at the spot, and we've been advancing this field for quite some time. But why the last 2 years? What happened in the last 2 years? Well, 3 things. ChatGPT, of course, started the generative AI era. It's able to not just understand, perceive and understand, it's able to also translate and generate unique content. I showed you the fusion of generative AI with computer graphics and it brought computer graphics to life. You guys -- just everybody in the world should be using ChatGPT. I know I use it every single morning, used it plenty this morning. And so ChatGPT was the generative AI era.
The second, by the way, generative computing, versus the way we used to do computing. It's not -- it's generative AI is a capability of software, but it has profoundly changed how computing is done. Computing used to be retrieval-based. Now it's generative. Keep that thought in mind when I talk about certain things, and you'll realize why it is that everything that we do is going to change how computers are architected, how computers are provided, how computers are going to be built out and what is the meaning of computing altogether. Generative AI, 2023, end of '22, 2023.
Next, reasoning AI: o1, which -- and then it took off with o3. Reasoning allowed it to reflect, allowed it to think to itself, allowed it to plan, break down problems and decompose a problem it couldn't understand into steps or parts that it could understand. It could ground itself on research. o1 made generative AI trustworthy and grounded on truth. That caused ChatGPT to simply take off. And that was a very, very big moment. The amount of input tokens that was necessary in order to produce, and the amount of output tokens it generated in order to reason, grew. The model was a little bit larger. Of course, you could have much larger models. o1 was a little bit larger, not much larger, but its input token usage for context and its output tokens for thinking increased the amount of computation tremendously.
Then came Claude Code, the first agentic model. It was able to read files, code, compile it, test it, evaluate it, go back and iterate on it. Claude Code has revolutionized software engineering, as all of you know. 100% of NVIDIA is using a combination of -- or oftentimes all 3 of them, Claude Code, Codex and Cursor all over NVIDIA. There's not one software engineer today who is not assisted by one or many AI agents helping them code. Claude Code completely revolutionized. It's the new inflection. And for the first time, you don't ask AI, what, where, when, how. You ask it create, do, build. You ask it to use tools, take your context, read files, it's able to agentically break down a problem, reason about it, reflect on it. It's able to solve problems and actually perform tasks.
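The write, test, evaluate and iterate cycle described here can be sketched as a generic loop. This is an illustration only, not how Claude Code, Codex or Cursor work internally: `agent_loop`, `propose_next_integer` and `evaluate_square` are hypothetical names, and the "task" is a toy search for the integer whose square is 9, where a real agent would edit files and run a test suite.

```python
def agent_loop(propose, evaluate, max_iterations=10):
    """Repeat propose -> evaluate until the evaluation passes or the budget ends."""
    feedback = None
    for attempt in range(1, max_iterations + 1):
        candidate = propose(feedback)            # produce a new attempt
        passed, feedback = evaluate(candidate)   # test it and collect feedback
        if passed:
            return candidate, attempt
    return None, max_iterations


# Hypothetical stand-ins for "write code" and "run the tests".
def propose_next_integer(feedback):
    # Start at 0, then try the integer after the last rejected candidate.
    return 0 if feedback is None else feedback + 1


def evaluate_square(candidate):
    # Passes when the candidate squared equals 9; the feedback handed back
    # is simply the candidate that was just tried.
    return candidate * candidate == 9, candidate


solution, attempts = agent_loop(propose_next_integer, evaluate_square)
```

The point of the structure is that each attempt is informed by the result of the previous one, and the loop stops on a passing evaluation or when the iteration budget is spent.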
An AI that was able to perceive became an AI that could generate. An AI that could generate became an AI that could reason. An AI that could reason now became an AI that can actually do work, very productive work. The amount of computation in the last 2 years, we know that everybody in this room knows, the computing demand for NVIDIA GPUs is off the charts. Spot pricing is skyrocketing. You couldn't find a GPU if you tried and yet in the meantime, we're shipping GPUs out. Incredible amounts of it and demand just keeps on going up. There's a reason for that. This fundamental inflection.
Finally, AI is able to do productive work, and therefore, the inflection point of inference has arrived. AI now has to think. In order to think, it has to inference. AI now has to do. In order to do, it has to inference. AI has to read. In order to do so, it has to inference. It has to reason. It has to inference. Every part of AI, every time it has to think, it has to reason, it has to do, it has to generate tokens. It has to inference. It's way past training now. It's in the field of inference. So the inference inflection has arrived, at a time when the amount of tokens, the amount of compute necessary increased by roughly 10,000 times. Now when I combine these 2, the fact that in the last 2 years, the computing demand of the work has gone up by 10,000 times, and the amount of usage has probably gone up by 100x, people have heard me say, I believe that computing demand has increased by 1 million times in the last 2 years.
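The 1 million figure follows from multiplying the two rough factors quoted in the talk. The short calculation below simply restates that arithmetic; the variable names are illustrative and the inputs are the speaker's estimates, not measured data.

```python
# Rough factors quoted in the talk (estimates, not measurements).
compute_per_task_growth = 10_000   # more compute needed per unit of work
usage_growth = 100                 # more usage overall

# Total demand growth is the product of the two independent factors.
total_demand_growth = compute_per_task_growth * usage_growth
```

The factors multiply rather than add because each of the 100 times as many requests also costs 10,000 times as much compute.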
It is the feeling that we all have. It is the feeling every startup has. It's the feeling that OpenAI has, it's the feeling that Anthropic has. If they could just get more capacity, they could generate more tokens, their revenues would go up. More people could use it, the more advanced, the smarter the AI could become. We are now at that positive flywheel system. We have reached that moment. The inflection -- the inference inflection has arrived. Last year, at this time, I said that where I stood at that moment in time, we saw about $500 billion. We saw $500 billion of very high confidence demand and purchase orders for Blackwell and Rubin through 2026. I said that last year.
Now I don't know if you guys feel the same way, but $500 billion is an enormous amount of revenue. Not one impressed. I know why you're not impressed because all of you had record years. Well, I'm here to tell you that right now, where I stand, a few short months after GTC D.C., 1 year after last GTC, right here where I stand, I see through 2027, at least $1 trillion. Now does it make any sense? And that's what I'm going to spend the rest of the time talking about. In fact, we are going to be short. I am certain computing demand will be much higher than that. And there's a reason for that.
So the first thing is we did a lot of work in the last year. Of course, as you know, 2025 was NVIDIA's year of inference. We wanted to make sure that not only were we good at training and post training, that we were incredibly good at every single phase of AI so that the investments that were made, investments made in our infrastructure could scale out for as long as they would like to use it and the useful life of NVIDIA's infrastructure would be long, and therefore, the cost would be incredibly low. The longer you could use it, the lower the cost. There's no question in my mind, NVIDIA systems are the lowest cost infrastructure you could get for AI infrastructure in the world. And so the first part was last year was all about AI for inference, and it drove this inflection point.
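The claim that a longer useful life lowers cost is a simple amortization argument. The sketch below spreads a purchase price evenly over the years of use; the function name, the straight-line assumption and the sample numbers are all hypothetical and are not NVIDIA pricing.

```python
def annual_cost(purchase_price, useful_life_years, yearly_operating_cost=0.0):
    """Straight-line cost per year: purchase price spread over the useful life."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return purchase_price / useful_life_years + yearly_operating_cost


# Hypothetical figures in arbitrary units, for illustration only.
cost_if_used_3_years = annual_cost(120.0, 3)
cost_if_used_6_years = annual_cost(120.0, 6)
```

Doubling the useful life halves the capital cost per year under this model, while operating costs per year are unaffected.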
Simultaneously, we were very pleased last year that Anthropic has come to NVIDIA, that MSL, Meta SL has chosen NVIDIA. And meanwhile, as a collection, as a group, this represents 1/3 of the world's AI compute, open source models. Open source models have reached near the frontier, and it is literally everywhere. And NVIDIA, as you know, today, we're the only platform in the world today that runs every single domain of AI across every single one of these AI models in language, in biology, in computer graphics, computer vision, in speech, proteins and chemicals, robotics and otherwise, edge or cloud, any language. NVIDIA's architecture is fungible for all of that, and we're incredible for all of that.
That allows us to be the lowest cost, the highest confidence platform because when you're building these systems, as I mentioned, $1 trillion is an enormous amount of infrastructure. You have to have complete confidence that the $1 trillion you're putting down will be utilized, would be performant, would be incredibly cost effective and have useful life for as long as you could see. That infrastructure investment you could make on NVIDIA, you could make with complete confidence. We have now proven that. It is the only infrastructure in the world that you could go anywhere in the world and build with complete confidence. You want to put it in any of the clouds, we're delighted by that. You want to put it on-prem, we're happy about that. You want to put it in any country anywhere, we're delighted to support you. We are now a computing platform that runs all of AI.
Now our business is already starting to show that. 60% of our business is hyperscalers, the top 5 hyperscalers. However, even within that top 5 hyperscalers, some of it is internal AI consumption. The internal AI consumption, really important work, like RecSys, is moving from recommender systems of tables and collaborative filtering and content filtering. It's moving towards deep learning and large language models. Search, moving to deep learning, large language models. Almost all of these different hyperscale workloads are now moving, shifting towards a workload that NVIDIA GPUs are incredibly good at. But on top of that, because we work with every AI lab, because we work with every -- we accelerate every AI model and because we have a large ecosystem of AI natives that we work with that we can bring to the clouds, that investment, no matter how large, no matter how quick, that compute will be consumed. And that represents 60% of our business.
The other 40% is just everywhere. Regional clouds, sovereign clouds, enterprise, industrial, robotics, edge, big systems, supercomputing systems, small servers, enterprise servers, the number of systems, incredible. The diversity of AI is also its resilience. The span of reach of AI is its resilience. There is no question this is not a one-app technology. This is now fundamental. This is absolutely a new computing platform shift.
Well, our job is to continue to advance the technology. And one of the most important things that I mentioned last year was last year was our year of inference. We dedicated everything. We took a giant chance and reinvented while Hopper was at its prime and it was just cooking, we decided that the Hopper architecture, the [ NVLink x8 ] had to be taken to the next level. We completely re-architected the system, disaggregated the computing system altogether and created NVLink 72. The way that it's built, the way it's manufactured, the way it's programmed completely changed. Grace Blackwell, NVLink 72 was a giant bet. And it wasn't easy for anybody. And many of my partners here in the room, I want to thank all of you for the hard work that you guys did. Thank you.
NVLink 72, NVFP4, not just FP4 precision, FP4 is a whole different type of Tensor Core and computational unit. We've demonstrated now that we can inference NVFP4 without loss of precision, but gigantic boost in performance and energy efficiency. We've also been able to use NVFP4 for training. So NVLink 72, NVFP4, the invention of Dynamo, TensorRT-LLM, a whole bunch of new algorithms. We even built a supercomputer to help us optimize kernels and help us optimize our complete stack. We call it DGX Cloud. We invested billions of dollars of supercomputing capability to help us create the kernels, the software that made inference possible.
Well, the results all came together and people used to tell me, but Jensen, inference is so easy. Inference is the ultimate hard. Inference is ultimate hard, and it's also ultimate important because it drives your revenues. And so this is the outcome. This is from SemiAnalysis, this is the largest, most comprehensive suite of AI inference that has ever been done. And what you see here on the left on this side is tokens per watt. Tokens per watt is important because every data center, every single factory, by definition, is power constrained. A 1 gigawatt factory will never become 2. It's physically constrained, the laws of atoms, the laws of physicality. And so that 1 gigawatt of data center, you want to drive the maximum number of tokens, which is the production, the product of that factory. So you want that -- you want to be on top of that curve as high as you want.
This -- the X-axis is the interactivity, the speed of inference, the speed of each inference. The faster you can inference, the faster you could, of course, respond. But very importantly, the faster you can inference, the larger the models, the more context you could process, the more tokens you can think through, this axis is the same as smartness of the AI. And so this is the throughput of the AI. This is the smartness of the AI. Notice the smarter the AI, the lower your throughput makes sense. You're thinking longer, okay? And so this axis is the speed, and I'm going to come back to this. This is important. This is where I torture all of you, but it's too important. Every CEO in the world, you watch, every CEO in the world will study their business from now on in the way I'm about to describe. Because this is your token factory. This is your AI factory. This is your revenues. There's no question about that going forward.
And so this is the throughput. This is the intelligence. Better perf per watt for a given power of data center, the more throughput, the more tokens you could produce. On this side is cost. Notice, NVIDIA is the highest performance in the world. Nobody would be surprised by that. They would be surprised by the fact that in one generation, whereas Moore's Law would have given us through transistors, 50%, 2x, Moore's Law would probably give us 1.5x more performance. You would have expected from Hopper H200, 1.5x higher. Nobody would have expected 35x higher. I said last year, at this time that NVIDIA's Grace Blackwell, NVLink 72 was 35x perf per watt. Nobody believed me. And then SemiAnalysis came out and Dylan Patel had a quote, he accused me of sandbagging. He accused me of sandbagging. He says Jensen sandbagged. It's actually 50x, and he's not wrong. He's not wrong.
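A minimal sketch of the fixed-power argument above: at a given data center power, throughput scales linearly with tokens per joule. The 1 gigawatt figure and the 1.5x, 35x and 50x factors are the ones quoted in the talk; the baseline efficiency `BASELINE_TOKENS_PER_JOULE` is a made-up placeholder, not a measured Hopper number.

```python
FACTORY_POWER_WATTS = 1e9  # 1 gigawatt factory, as in the talk

# Hypothetical baseline efficiency, chosen only for illustration.
BASELINE_TOKENS_PER_JOULE = 0.002

# Generational perf-per-watt factors quoted in the talk.
UPLIFTS = {
    "moores_law_estimate": 1.5,
    "nvidia_claim": 35.0,
    "semianalysis_measurement": 50.0,
}


def factory_tokens_per_second(tokens_per_joule, power_watts=FACTORY_POWER_WATTS):
    """Watts are joules per second, so tokens/joule * watts = tokens/second."""
    return tokens_per_joule * power_watts


def throughput_by_scenario(baseline=BASELINE_TOKENS_PER_JOULE):
    """Throughput at fixed power for each quoted uplift factor."""
    return {
        name: factory_tokens_per_second(baseline * factor)
        for name, factor in UPLIFTS.items()
    }


if __name__ == "__main__":
    for name, tps in throughput_by_scenario().items():
        print(f"{name}: {tps:,.0f} tokens/s")
```

Because the power budget is fixed, the ratio between any two scenarios equals the ratio of their perf-per-watt factors, whatever baseline is plugged in.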
And so our cost per token is the lowest in the world. You can't beat it. I've said before, if you have the wrong architecture, even if it's free, it's not cheap enough. And the reason for that is because no matter what happens, you still have to build a gigawatt data center. You still have to build a gigawatt factory. And that gigawatt factory for 15 years amortized across that gigawatt factory is about $40 billion. Even when you put nothing on, it's $40 billion in. You better make for darn sure, you put the best computer system on that thing so that you could have the best token cost. NVIDIA's token cost is world-class, basically untouchable at the moment. And the reason that's true is because of an extreme co-design. And so I'm very happy that he named us -- there was a Monkey King, token king.
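The "even if it's free, it's not cheap enough" point can be sketched with a simple amortization. The $40 billion facility cost and the 15 year horizon are from the talk; the compute prices and token rates below are hypothetical placeholders, and the model ignores electricity, staffing and the shorter refresh cycle of real hardware.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

FACILITY_COST_USD = 40e9   # gigawatt factory shell, power, cooling (from the talk)
AMORTIZATION_YEARS = 15    # horizon quoted in the talk


def cost_per_million_tokens(compute_cost_usd, tokens_per_second,
                            facility_cost_usd=FACILITY_COST_USD,
                            years=AMORTIZATION_YEARS):
    """Total capital cost divided by lifetime token output, per million tokens."""
    lifetime_tokens = tokens_per_second * SECONDS_PER_YEAR * years
    return (facility_cost_usd + compute_cost_usd) / lifetime_tokens * 1e6


# Hypothetical systems: an expensive fast one and a free slow one.
fast_paid = cost_per_million_tokens(compute_cost_usd=50e9, tokens_per_second=100e6)
slow_free = cost_per_million_tokens(compute_cost_usd=0.0, tokens_per_second=20e6)

if __name__ == "__main__":
    print(f"fast, paid system: ${fast_paid:.2f} per million tokens")
    print(f"slow, free system: ${slow_free:.2f} per million tokens")
```

With these assumed numbers the free system still produces the more expensive token, because the facility cost is spread over far fewer tokens.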
Well, we take all of our software. As I told you, we vertically integrate, but we horizontally open. We're vertical integration, horizontal open. We integrate all of our software and all of our technology, however we could package it up and integrate it into the world's inference service providers. And these companies are growing so fast. They're growing so fast. Fireworks, Lin is here, Together, they're just growing so incredibly fast. 100x in the last year. They are token factories. And the effectiveness, the performance and the token cost production capability for their factories is everything to them. And this is what happened. This is -- we updated their software, same system and notice their token speeds, incredible. The difference before NVIDIA updated everything and all of our algorithms and software and all the technology that we bring to bear, about 700 tokens per second average went to nearly 5,000, 7x higher. And so this is the incredible power of extreme codesign.
I mentioned earlier the importance of factories. This is the importance of factory. Your data center, it used to be a data center for files. It's now a factory to generate tokens. Your factory is limited no matter what. Everybody is looking for land, power and shell. Once you build it, you are power limited. Within that power limited infrastructure, you better make for darn sure that your inference because you know inference is your workload, and tokens is your new commodity, that compute is your revenues that you want to make sure that the architecture is as optimized as you can. In the future, every single CSP, every single computer company, every single cloud company, every single AI company, every single company period are going to be thinking about their token factory effectiveness. This is your factory in the future.
And the reason why I know that is because everybody in this room is powered by intelligence. And in the future, that intelligence will be augmented by tokens. So let me show you how we got here.
[Presentation]
Now in the good old days, when I would say, Hopper, I would hold up a chip. That's just adorable. This is Vera Rubin. When we think Vera Rubin, we think the entire system, vertically integrated completely with software, extended end-to-end, optimized as one giant system.
The reason why it's designed for agentic systems is very clear because agents, of course, the most important workload is its thinking, the large language model. The large language models are going to get larger and larger and larger. It's going to generate more and more tokens more quickly, so it could think more quickly, but it also has to access memory.
It's going to pound on memory really hard. KV Cache, structured data, cuDF, unstructured data, cuVS. It's going to be pounding on the storage system really, really hard, which is the reason why we reinvented the storage system. It is also going to use tools. And unlike humans that are more tolerant to slower computers, AI wants the tools to be as fast as possible.
These tools, web browsers in the future, they could also be virtual PCs in the cloud. Those PCs have to be -- and those computers have to be as fast as possible. We created a brand-new CPU, a brand-new CPU that's designed for extremely high single-threaded performance, incredibly high data output, incredibly good at data processing and extreme energy efficiency.
It is the only data center CPU in the world that uses LPDDR5 and incredible single-thread performance and performance per watt that is unrivaled. And so that's -- we built that so that it could go along with the rest of these racks for agentic processing. And so here it is. This is the Grace Blackwell -- no, Vera Rubin, where is it? Here it is, okay? So this is the Vera Rubin system. Notice since the last time, 100% liquid cooled. All of the cables gone.
What used to take 2 days to install, now takes 2 hours, incredible. And so the manufacturing cycle time is going to dramatically reduce. This is also a supercomputer that is cooled by -- it's cooled by hot water, 45 degrees, which takes the pressure off of the data center, takes all of that cost and all of that energy that's used to cool the data center and makes it available for the system. This is the secret sauce.
It is the only -- we're the only company in the world that has today built the sixth generation scale-up switching system. This is not Ethernet. This is not InfiniBand. This is NVLink. This is the sixth generation NVLink. This is insanely hard to do well. It is insanely hard to do, period. And I'm just super proud of the team, NVLink, completely liquid cooled. This is the brand-new Groq system, and I'll show you a little bit more about it. This system, 8 Groq chips, this is the LP30.
The world has never seen it. Anything that the world has ever seen is V1. This is third generation, and we're in volume production now. And I'll show you more about that in just a second. The world's first CPO Spectrum-X switch. This is also in full production, co-packaged optics. Optics comes directly onto this chip, interfaces directly to silicon, electrons gets translated to photons and it gets directly connected to this chip.
We invented the process technology with TSMC. We're the only one in production with it today. It's called COUPE. It's completely revolutionary. NVIDIA is in full production with Spectrum-X.
This is the Vera system, twice the performance per watt of any CPUs in the world today. It is also in production. Well, we never thought we would be selling CPU stand-alone. We are selling a lot of CPU stand-alone. This is already, for sure, going to be a multibillion-dollar business for us. So I'm very, very pleased with our CPU architects. We've designed a revolutionary CPU. And this is the CX9 powered with Vera CPU, the BlueField-4 STX, our new storage platform, okay?
So these are the -- these are the racks and it's connected each one of these racks, the NVLink rack, this is -- I've shown you guys this before. It's super heavy. It seems to get heavier every year because I think there's just more cables in there every year. And so this is the NVLink rack.
We've also taken this technology because it is so efficient to create a data center with these cabling systems, structured cables. So we decided to do that for Ethernet. So this is Ethernet 256 liquid-cooled nodes in one rack, and it is also connected with these incredible connectors. You guys want to see Rubin Ultra. So this is the Rubin Ultra compute node. Unlike Rubin that slides in horizontally, Rubin Ultra goes into a whole new rack, it's called Kyber that enables us to connect 144 GPUs in 1 NVLink domain.
And so the Kyber rack, this -- I could lift it, I'm sure, but I won't. It's quite heavy. This is one compute node, and it slides into the Kyber rack vertically. This is where it connects into. This is the mid-plane. The Kyber racks, those 4 top NVLink connectors slide in and connect into this, and this becomes one of the nodes.
And each one of these racks is a different compute node, and this is the amazing part. This is the midplane -- and the back of the midplane, instead of the cabling system, which has its limits in terms of how far we could drive cables, copper cables, we now have this system to connect 144 GPUs.
This is the new NVLink. This sits also vertically and it connects into the midplanes on the back, compute in the front, NVLink switches in the back, one giant computer, okay? So that is Rubin Ultra.
As I mentioned -- how about we take this back down? I need the rest of my slides. Oh, it's coming down. Okay. Thank you, Gennie. This is what happens when you don't practice. Okay. All right. So you saw -- take your time, don't get hurt. You saw this slide. Only NVIDIA's keynote where you see last year's slide presented again. And the reason for that is I just want to let you know that last year, I told you something very, very important. And it's so important, it's worthwhile to tell you again.
This is probably the single most important chart for the future of AI factories. And every CEO in the world will be tracking it, will be studying it very deeply. It's much, much more complicated than this. It's multidimensional, but you will be studying the throughput and the token speed of your AI factories, the throughput, token speed at ISO power because that's all the power you have, throughput and token speed for your factories forever. And that analysis is going to lead directly to your revenues. What you do this year will show up precisely next year as your revenues.
And this chart is what it's all about. And I said on the vertical axes, on the vertical axes -- thank you, guys. On the vertical axes is throughput.
On the horizontal axis is token rate. Today, I'm going to show you this. Because we're -- because we're now able to increase the token speed and because model sizes are increasing because the token length, the context length depending on the different grades of different application use case continues to grow from maybe 100,000 tokens input length to maybe millions.
The token input length is growing and also the output token length is growing. And so all of these play into ultimately the marketing and the pricing of future tokens. Tokens are the new commodity. And like all commodities, once it reaches an inflection, once it becomes mature or becomes maturing, it will segment into different parts. The high throughput, low speed could be used for the free tier.
The next tier could be the medium tier, larger model maybe, higher speed for sure, larger input context length. That translates to a different price point. You could see from all the different services, this one is free. It's a free tier. The first tier could be $3 per million tokens. The next tier could be $6 per million tokens. You would like to be able to keep pushing this boundary because the larger the model, smarter, the more input token context length, more relevant, the higher the speed -- the long -- the more you can think and iterate smarter AI models. So this is about smarter AI models. And when you have smarter AI models, each one of these clicks allows you to increase the price. So this is $45. And maybe one day, there'll be a premium model that allows you a premium service that allows you to generate token speeds that are incredibly high because you're in a critical path or maybe you're doing really long research and $150 per million tokens is just not a thing. So let's translate that.
Suppose you were to use 50 million tokens per day as a researcher at $150 per million tokens. As it turns out, as a research team, that's not even a thing. So we believe that this is the future. This is where AI wants to go. This is where it is today. It had to start here to establish the value and establish its usefulness and get better and better and better. In the future, you're going to see most services encompass all of that. This is Hopper. Hopper started, and I moved the chart, this is 50, this is 100. Hopper looks like this. And you would have expected Hopper, the next generation to be higher, but nobody would have expected it to be that much higher. This is Grace Blackwell.
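The researcher example works out as follows. The price points are the ones named in the talk; the tier labels in `PRICE_PER_MILLION_TOKENS_USD` are invented here for readability.

```python
# Price points quoted in the talk, in USD per million tokens.
# The dictionary keys are illustrative labels, not official tier names.
PRICE_PER_MILLION_TOKENS_USD = {
    "free": 0.0,
    "first_paid": 3.0,
    "second_paid": 6.0,
    "high": 45.0,
    "premium": 150.0,
}


def daily_cost_usd(tokens_per_day, price_per_million_usd):
    """Cost of one day's token consumption at a flat per-million price."""
    return tokens_per_day / 1_000_000 * price_per_million_usd


researcher_tokens_per_day = 50_000_000  # from the talk
premium_daily = daily_cost_usd(
    researcher_tokens_per_day, PRICE_PER_MILLION_TOKENS_USD["premium"]
)

if __name__ == "__main__":
    print(f"premium tier: ${premium_daily:,.0f} per day")
```

At the premium price this comes to $7,500 per day for the 50 million tokens in the example.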
What Grace Blackwell did is at your free tier, increase your throughput tremendously. However, where you mostly monetize your service it increased your throughput by 35x. This is no different than any product that every company makes. The higher the tier, the higher the quality, the higher the performance, the lower the volume, the lower the capacity. And so it is no different than any other business in the world.
And so now we're able to increase this tier by 35x, and we introduced a whole new tier. This is the benefit of Grace Blackwell, a huge jump over Hopper. Well, this is what we're doing with -- okay. So this is Grace Blackwell, okay? Let me just reset this, and this is Vera Rubin, okay? Now just think what just happened. At every single tier, at every single tier, at every single tier, we increased the throughput.
And at the tier where you have your highest ASP and your most valuable segment, we increased it by 10x. That is the hard work. This is incredibly hard to do out here. This is the benefit of NVLink 72. This is the benefit of extremely low latency. This is the benefit of extreme codesign that we could shift the entire area up.
Now what does it mean from a customer perspective in the end? Suppose I were to take all of that and I just multiply it against -- suppose I took 25% of my power, used it in a free tier, 25% of my power in the medium tier, 25% of my power in the high tier and 25% of my power in the premium tier. My data center only has a gigawatt. And so I get to decide how I want to distribute. The free tier allows me to attract more customers.
This allows me to serve my most valuable customers. And the combination, the product of all that allows you basically your revenues. The revenues you can generate, assuming this simplistic example, allows Blackwell to generate 5x more revenues, Vera Rubin to generate 5x. So Vera Rubin, you should get there as soon as you can.
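A rough sketch of the revenue arithmetic described above: split the factory's power across tiers, multiply each share by that tier's throughput and price, and sum. The 25% split is from the talk; the throughput and price values attached to each tier are hypothetical, and the talk does not say which price goes with which tier.

```python
def revenue_per_second_usd(tiers):
    """tiers: list of (power_share, tokens_per_second_at_full_power, usd_per_million).

    Each tier earns its share of the power budget times the throughput the
    whole factory would deliver at that tier, times the tier's token price.
    """
    total_share = sum(share for share, _, _ in tiers)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("power shares must sum to 1")
    return sum(share * tps * price / 1e6 for share, tps, price in tiers)


# Hypothetical factory: higher tiers run slower per watt but sell for more.
EXAMPLE_TIERS = [
    (0.25, 8e6, 0.0),    # free tier
    (0.25, 4e6, 3.0),    # medium tier
    (0.25, 2e6, 6.0),    # high tier
    (0.25, 1e6, 45.0),   # premium tier
]

if __name__ == "__main__":
    print(f"${revenue_per_second_usd(EXAMPLE_TIERS):.2f} per second")
```

In this toy setup most of the revenue comes from the premium tier even though it produces the fewest tokens, which is why a throughput gain at the high-speed end of the curve matters more than the same gain at the free tier.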
And the reason for that is because your cost of tokens goes down and your throughput goes up. But we want even more. We want even more. And so let me just show you back to this. This is -- as I told you, this throughput requires a ton of flops. This latency, this interactivity requires enormous amount of bandwidth.
Computers don't like extreme amount of flops, extreme amount of bandwidth because there's only so much surface area for chips that any systems has. And so optimizing for high throughput and optimizing for low latency are, in fact, enemies of each other. And so this is what happened when we combined with Groq, okay?
And so we acquired the team that worked on the Groq chips and licensed the technology, and we've been working together now to integrate the system. This is what that looks like. So at the most valuable tier -- at the most valuable tier, we're now going to increase performance by 35x. Now this very simple chart revealed to you exactly the reason why NVIDIA is so strong in the vast majority of the workloads so far.
And the reason for that is because up in this area, throughput matters so much. NVLink 72 is so game changing. It is exactly the right architecture, and it's even hard to beat even as you add Groq to it. However, if you extended this chart way out here and you said you wanted to have services that delivers not 400 tokens per second, but 1,000 tokens per second, all of a sudden, NVLink 72 runs out of steam and it simply can't get there. We just don't have enough bandwidth. And so this is where Groq comes in, and this is what happens when we push that out.
So it goes out beyond -- thank you -- it goes out beyond even the limits of what NVLink 72 can do. And if you were to do that, translate that into revenues, relative to Blackwell, Vera Rubin is 5x. If most of your workload is high throughput, I would stick with just 100% Vera Rubin. If a lot of your workload wants to be coding and very high-value engineering token generation, I would add Groq to it.
I would add Groq to maybe 25% of my total data center. The rest of my data center is all 100% Vera Rubin. And so that gives you a sense of how you would add Groq to Vera Rubin and extend its performance and extend its value even more. This is what happens. This is a contrast. The reason why Groq was so attractive to me is because their computing system, a deterministic data flow processor, it is statically compiled. It is compiler scheduled, meaning the compiler figures out when do the compute -- the compute and the data arrives at the same time.
All of that is done statically in advance and scheduled completely in software. There's no dynamic scheduling. The architecture is designed with massive amounts of SRAM, it is designed just for inference, this one workload. Now this one workload, as it turns out, is the workload of AI factories. And as the world continues to increase the amount of high-speed tokens it wants to generate with super smart tokens it wants to generate, the value of this integration is going to get even higher.
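To illustrate what "compiler scheduled" means in general terms, here is a toy static scheduler. It is not Groq's compiler; it assumes every operation has a fixed, known latency and ignores resource conflicts, so all start cycles can be computed before anything runs. The operation names and latencies are invented.

```python
def static_schedule(ops):
    """Assign a start cycle to each op ahead of time.

    ops: list of (name, latency_cycles, dependency_names), already in
    dependency order. Because latencies are fixed, an op can start exactly
    when its last input finishes, with no runtime arbitration.
    """
    start, finish = {}, {}
    for name, latency, deps in ops:
        start[name] = max((finish[d] for d in deps), default=0)
        finish[name] = start[name] + latency
    return start


# Hypothetical four-op program.
PROGRAM = [
    ("load_weights", 4, []),
    ("load_activations", 6, []),
    ("matmul", 10, ["load_weights", "load_activations"]),
    ("store", 2, ["matmul"]),
]

if __name__ == "__main__":
    for op, cycle in static_schedule(PROGRAM).items():
        print(f"cycle {cycle:>2}: {op}")
```

The schedule is a plain table produced once, which is the sense in which data and compute "arrive at the same time" without dynamic scheduling.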
And so these are 2 extreme processors you could see. One chip, 500 megabytes, 1 Vera Rubin chip -- 1 Rubin chip, 288 gigabytes -- it would take a lot of Groq chips to be able to hold the parameter size of Rubin as well as all of the context that has to go -- the KV Cache that has to go along with it. So that limited Groq's ability to really reach the mainstream to really take off until we had a great idea.
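The capacity gap can be put in numbers. The 500 megabyte and 288 gigabyte figures are from the talk; decimal units, the 1 trillion parameter model size and 4 bits per parameter are assumptions made here, and the count covers weights only, not KV cache.

```python
GROQ_CHIP_BYTES = 500 * 10**6    # 500 MB per Groq chip (from the talk)
RUBIN_CHIP_BYTES = 288 * 10**9   # 288 GB per Rubin chip (from the talk)


def chips_needed(total_bytes, bytes_per_chip):
    """Smallest whole number of chips whose combined memory holds total_bytes."""
    return -(-total_bytes // bytes_per_chip)  # ceiling division on integers


def model_bytes(num_parameters, bits_per_parameter):
    """Weight storage only; ignores KV cache and activations."""
    return num_parameters * bits_per_parameter // 8


groq_per_rubin = chips_needed(RUBIN_CHIP_BYTES, GROQ_CHIP_BYTES)

# Assumed model: 1 trillion parameters stored at 4 bits each.
groq_for_model = chips_needed(model_bytes(10**12, 4), GROQ_CHIP_BYTES)

if __name__ == "__main__":
    print(f"Groq chips to match one Rubin chip's memory: {groq_per_rubin}")
    print(f"Groq chips for the assumed model's weights: {groq_for_model}")
```

Under these assumptions it takes 576 Groq chips to match the memory of a single Rubin chip, and 1,000 to hold the weights of the assumed model.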
What if we disaggregated inference altogether with a piece of software called Dynamo? What if we rearchitected the way that inference is done in the pipeline so that we could put the work that makes perfect sense on Vera Rubin and then offload the decode generation, the low latency, the bandwidth limited challenged part of the workload to Groq. And so we united, unified processors of extreme differences, one for high throughput, one for low latency.
It still doesn't change the fact that we need a lot of memory. And so Groq, we're just going to add a whole bunch of Groq chips, which expands the amount of memory it has. And so if you could just imagine, a 1 trillion parameter model, we have to store all of that in Groq chips.
However, it sits next to NVIDIA Vera Rubin, where we could hold the massive amounts of KV Cache that's necessary in processing all of these agentic AI systems. It's based upon this idea of disaggregated inference. We do the prefill, that's the easy part, but we also tightly integrate the decode.
So the attention part of decode is done on NVIDIA's Vera Rubin, which needs a lot of math and the feed forward network part of it, the decode part is done -- the token generation part is done on Vera Rubin -- on the Groq chip. The 2 of them working tightly coupled together over today, Ethernet with a special mode to reduce its latency by about half.
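The placement described above can be written as a small routing table. This is a sketch, not Dynamo's API: the device labels and function names are invented, and the per-layer alternation between attention and feed forward is an assumption based on the usual transformer layer structure.

```python
# Which processor handles which phase, following the split in the talk.
PLACEMENT = {
    "prefill": "vera_rubin",
    "decode_attention": "vera_rubin",
    "decode_ffn": "groq_lpu",
}


def plan(num_output_tokens, num_layers):
    """List of (phase, device) steps for one request."""
    steps = [("prefill", PLACEMENT["prefill"])]
    for _ in range(num_output_tokens):
        for _ in range(num_layers):
            steps.append(("decode_attention", PLACEMENT["decode_attention"]))
            steps.append(("decode_ffn", PLACEMENT["decode_ffn"]))
    return steps


def count_handoffs(steps):
    """Number of times consecutive steps run on different devices."""
    return sum(1 for a, b in zip(steps, steps[1:]) if a[1] != b[1])


if __name__ == "__main__":
    steps = plan(num_output_tokens=2, num_layers=2)
    print(len(steps), "steps,", count_handoffs(steps), "cross-device handoffs")
```

Under this assumed layout the number of handoffs grows with tokens times layers, which suggests why the latency of the link between the two systems matters so much.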
And so that capability allows us to integrate these 2 systems. We run Dynamo, this incredible operating system for AI factories on top of it, and you get 35x increase, not to mention additional new tiers of inference performance for token generation the world has never seen. So this is it. This is Groq. The Vera Rubin systems, including Groq. I want to thank Samsung, who manufactures the Groq LP30 chip for us, and they're cranking as hard as they can. I really appreciate you guys.
We're in production with the Groq chip, and we'll ship it in the second half, probably about Q3 time frame, okay? Groq LPX. Vera Rubin, it's kind of hard to imagine any more customers.
And the really great thing is Grace Blackwell, early sampling of it was really complicated because of coming together of NVLink 72, but the sampling of Vera Rubin is just going incredibly well. And in fact, Satya, I think, texted out already that the first Vera Rubin rack is already up and running at Microsoft Azure.
And so I'm super excited for them. We're going to keep cranking these things out. We have now set up a supply chain that can manufacture thousands a week of these systems, essentially multi-gigawatts of AI factories per month inside our supply chain. And so we're going to crank out these Vera Rubin racks while we're cranking out the GB300 racks. We are in full production.
The Vera CPU is incredibly successful. And the reason for that is because AI needs CPUs for tool use. And Vera CPU was designed just perfectly for that sweet spot, incredible for the next generation of data processing. Vera CPU is ideal. The Vera CPU plus BlueField plus CX9 connected into the BlueField 4 stack, 100% of the world's storage industry is joining us on this system.
And the reason for that is because they see exactly the same thing. The storage system is going to get pounded. It's going to get pounded because we used to have humans using the storage systems. We used to have humans using SQL.
Now we're going to have AIs using these storage systems, and it's going to store cuDF accelerated storage, cuVS accelerated storage as well as very importantly, KV Caching, okay? So this is the Vera Rubin system. Now what's amazing is this, in just 2 years' time, in a 1 gigawatt factory -- in just 2 years' time, in 1 gigawatt factory, using the mathematics that I showed you earlier, whereas Moore's Law would have given us a couple of steps, we would have x factored the number of transistors. We would have x factored the number of flops. We would have x factored the amount of bandwidth.
But with this architecture, we're going to take our token generation speed, token generation rate from 2 million to 700 million, 350x increase. This is the power of extreme co-design. This is what I mean when we integrate and optimize vertically, but then we open it horizontally for everybody to enjoy. This is our road map very quickly. Blackwell is here, the Oberon system. In the case of Rubin, we have the Oberon system. We're always backwards compatible so that if you wanted to not change anything and just keep on moving through with the new architecture, you could do so.
The old system -- the standard rack system, Oberon, still available. Oberon is copper scale up. And with Oberon, we could also use optical scale-out -- or excuse me, optical scale up to expand to NVLink 576, okay? And so there's a lot of conversation about is NVIDIA going to copper scale up or optical scale up?
We're going to do both. So we're going to have NVLink 144 with Kyber and then with Oberon, we're going to NVLink 72 plus optical to get to NVLink 576. The next generation of Rubin with Rubin Ultra, we have the Rubin Ultra chip, which is coming -- which is taping out.
And we have a brand-new chip, LP35. LP35 will, for the first time, incorporate NVIDIA's NVFP4 computing structure, give you another few X factor speed up, okay? And so this is Oberon NVLink 72 optical scale up, and it uses Spectrum 6, the world's first co-packaged optical and all of this is in production. The next generation from here is Feynman. Feynman has a new GPU, of course. It also has a new LPU, LP40, big step up, incredible, incredible new technology.
Now uniting the scale of NVIDIA and the Groq team building together LP40 is going to be incredible. A brand-new CPU called Rosa, short for Rosalind, BlueField 5, which connects the next CPU with the next SuperNIC CX10. We will have Kyber which is copper scale up, we will also have Kyber CPO scale up.
So for the first time, we will scale up with both copper and co-packaged optics, okay? And so a lot of people have been asking, Jensen, is copper going to still be important? The answer is yes. Jensen, are you going to scale up optical? Yes. Are you going to scale out optical? Yes.
And so for everybody who is in our ecosystem, we need a lot more capacity, and that's really the key. We need a lot more capacity for copper. We need a lot more capacity for optics. We need a lot more capacity for CPO. And that's the reason why we've been working with all of you to lay the foundation for this level of growth. And so Feynman will have all of that.
Let me see if I missed everything. That's it. Every single year, brand-new architecture. Very quickly, NVIDIA went from a chip company to an AI factory company or AI infrastructure company, AI computing company, these systems. And now we're building entire AI factories. There's so much power that is squandered in these AI factories. We want to make sure that these AI factories come together, designed in the best possible way.
Most of these components never meet each other. Most of us technology vendors. Now we all know each other. But in the past, we never met each other until the data center. That can't happen. We're building super complex systems. And so we have to meet each other virtually somewhere else. And so we created Omniverse and the Omniverse DSX world, a platform where all of us can meet and design these gigafactories, gigawatt AI factories virtually in system.
We have simulation systems for the racks for mechanical, thermal, electrical, networking, those simulation systems integrated into all of our ecosystem partners of incredible tools companies. We also operate it, connect it to the grid so that we could interact with each other, send each other information so that we could adjust grid power and data center power accordingly, saving energy.
And then inside the data center using Max-Q so that we could adjust the system dynamically across power and cooling and all of the different technologies we all work on together so that we leave no power squandered, so that we run at the most optimal rate to deliver enormous amount of token throughput.
There's no question in my mind, there's a factor of 2 in here. And the factor of 2 at the scale we're talking about is gigantic. We call this the NVIDIA DSX platform. And just as all of our platforms, there's the hardware layer, there's the library layer and there's the ecosystem layer. It's exactly the same way. Let's show it to you.
[Presentation]
It's incredible, right? Well, Omniverse was designed to hold the world's digital twin, starting from the Earth, and it's going to hold digital twins of all sizes. And so we have such a great ecosystem of partners. I want to thank all of you. All of these companies are brand new to our world.
We didn't know many of you just a couple of years ago. And now we're working so close together to work on and build together the largest computer the world's ever seen and also to do it at planetary scale. So NVIDIA DSX is our new AI factory platform. I'll spend very little time on this time.
However, we're going to space. We've already been out in space. Thor is radiation approved, and we're in satellites. You do imaging from satellites. In the future, we'll also build data centers in space. Obviously, very complicated to do so. We have -- we're working with our partners on a new computer called Vera Rubin Space-1, and it's going to go out to space and start data centers out in space.
Now of course, in space, there's no conduction, there's no convection. There's just radiation. And so we have to figure out how to cool these systems out in space, but we've got lots of great engineers working on it. Let me talk to you about something new.
So Peter Steinberger is here, and he wrote a piece of software. It's called OpenClaw. And I don't know if he realized how successful it's going to be, but the importance is profound. OpenClaw is the #1. It's the most popular open source project in the history of humanity, and it did so in just a few weeks.
It exceeded what Linux did in 30 years, and it's that important. It is that important. It will do -- well, this is all you do, okay? We're announcing our support of it. Let me just quickly go through this. I want to show you a couple of things. You simply type this into a console and it goes out, it finds OpenClaw, it downloads it. It builds you an AI agent, and then you could tell it whatever else you need to do, okay? So let's take a look.
[Presentation]
Incredible. Now I illustrated effectively what OpenClaw is in this way so that all of you can understand it, but let's just think what happened. What is OpenClaw? It connects -- it's an agentic system. It calls and connects to large language models. So the first thing it has, it has resources that it manages.
It could access tools, it could access file systems, it could access large language models. It's able to do scheduling. It's able to do cron jobs. It's able to decompose a problem, a prompt that you gave it, into step by step by step. It could spin off and call upon other subagents. It has I/O. You could talk to it in any modality you want. You could wave at it and it understands you.
You could talk to it in any modality you want. It sends you messages, it texts you, sends you e-mail. So it's got I/O. What else does it have? Well, based on that, you could say, in fact, it's an operating system. I've just used the same syntax that I would describe an operating system. OpenClaw has open sourced essentially the operating system of agentic computers. It is no different than how Windows made it possible for us to create personal computers.
Now OpenClaw has made it possible for us to create personal agents. The implication is incredible. The implication is incredible. First of all, the adoption says something all in itself. However, the most important thing is this: every single company now realizes, every single company, every single software company, every single technology company, for the CEOs, the question is, what's your OpenClaw strategy? Just as we all needed to have a Linux strategy. We all needed to have an HTTP, HTML strategy, which started the Internet.
We all needed to have a Kubernetes strategy, which made it possible for mobile cloud to happen. Every company in the world today needs to have an OpenClaw strategy, an agentic system strategy. This is the new computer. Now this is just the exciting part. This is enterprise IT before OpenClaw. And I mentioned earlier, the way enterprise IT works and the reason why it's called data centers is because these large rooms, these large buildings held data, held the files of people, the structured data of business.
It would pass through software that has tools and systems of records and all kinds of workflow that's codified into it, and that turns into tools that humans would use. Digital workers would use. That is the old IT industry, software companies creating tools, saving files and of course, GSIs consultants that help companies figure out how to use these tools and integrate these tools.
These tools are incredibly valuable for governance and security and privacy and compliance and all of that continues to be true. It's just that post-OpenClaw, post agentic, this is what it's going to look like. This is the extraordinary part. Every single IT company, every single company, every company, every SaaS company will become a [ GaaS ] company. No question about it. Every single SaaS company will become a [ GaaS ] company, an Agentic-as-a-Service company. And what's amazing is this, you know OpenClaw gave us -- gave the industry exactly what it needed at exactly the right time.
Just as Linux gave the industry exactly what it needed at exactly the right time, just as Kubernetes showed up at exactly the right time, just as HTML showed up. It made it possible for the entire industry to grab on to this open source stack and go do something with it. There's just one catch. Agentic systems in the corporate network can have access to sensitive information, they can execute code and they can communicate externally. Just say that out loud, okay? Think about it. Access sensitive information, execute code, communicate externally.
You could, of course, access employee information, access supply chain, access finance information and send it out, communicate externally. Obviously, this can't possibly be allowed.
And so what we did was we worked with Peter. We took some of the world's best security and computing experts, and we worked with Peter to make OpenClaw, enterprise secure and enterprise private capable. And we call that -- this is our NVIDIA OpenClaw reference for Open -- NemoClaw, which is a reference for OpenClaw, and it has all these agentic AI toolkits. And the first part of it is technology we call OpenShell that has now been integrated into OpenClaw.
Now it's enterprise ready. This stack with a reference design we call NemoClaw, okay? With a reference stack we call NemoClaw, you could download it, play with it and you could connect to it the policy engine of all of the SaaS companies in the world. And your policy engines are super important, super valuable. So the policy engines could be connected, NemoClaw or OpenClaw with OpenShell would be able to execute that policy engine. It has a network guardrail. It has a privacy router. And as a result, we could protect and keep the claws from executing inside our company and do it safely.
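The risk described above (an agent that can access sensitive information, execute code and communicate externally) can be illustrated with a toy policy check. This is only a sketch under stated assumptions: `Capability`, `AgentSession`, `allowed` and `request` are hypothetical names invented for this example and are not part of OpenClaw, NemoClaw or OpenShell, whose actual interfaces are not described in the talk.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Capability(Enum):
    """The three risky capabilities named in the talk (hypothetical modelling)."""
    READ_SENSITIVE = auto()
    EXECUTE_CODE = auto()
    SEND_EXTERNAL = auto()


@dataclass
class AgentSession:
    """Tracks what one agent has done so far (hypothetical structure)."""
    name: str
    tainted: bool = False                    # True once sensitive data was read
    log: list = field(default_factory=list)  # (capability name, decision) pairs


def allowed(session: AgentSession, capability: Capability, allow_code: bool = False) -> bool:
    """Toy policy: code runs only if explicitly enabled; nothing leaves after a sensitive read."""
    if capability is Capability.EXECUTE_CODE:
        return allow_code
    if capability is Capability.SEND_EXTERNAL:
        return not session.tainted
    return True


def request(session: AgentSession, capability: Capability, allow_code: bool = False) -> bool:
    """Ask the policy for a decision, record it, and update the session state."""
    ok = allowed(session, capability, allow_code)
    if ok and capability is Capability.READ_SENSITIVE:
        session.tainted = True
    session.log.append((capability.name, ok))
    return ok


if __name__ == "__main__":
    session = AgentSession("demo-claw")
    for cap in (Capability.SEND_EXTERNAL, Capability.READ_SENSITIVE, Capability.SEND_EXTERNAL):
        print(cap.name, "->", "allowed" if request(session, cap) else "denied")
```

The design choice here is that the decision depends on the session's history rather than on the single action: sending a message is harmless until the agent has touched sensitive data, which is the combination the talk singles out.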
We also added several things to the agentic system. And one of the most important things you want to do with your own custom claws is so that you can have your custom models. And this is NVIDIA's open model initiative. We are now at the frontier of every single domain of AI models, whether it's Nemotron; Cosmos, world foundation model; GR00T, artificial general robotics, humanoid robotics models; Alpamayo for autonomous vehicles; BioNeMo for digital biology; Earth-2 for AI physics, we are at the frontier on every single one. Take a look.
[Presentation]
Our models -- Thank you. Our models are valuable to all of you because number one, it's on the top of the leaderboard. It's world-class. But most importantly, it's because we are not going to give up working on it. We're going to keep on working on it every single day.
Nemotron 3 is going to be followed by Nemotron 4. Cosmos 1 was followed by Cosmos 2. GR00T at generation 2. Each and every one of these, we're going to continue to advance these models, vertical integration, horizontal openness, so that we can enable everybody to join the AI revolution, #1 on leaderboard across research and voice and world models and artificial general robotics and self-driving cars and reasoning. And of course, one of the most important ones, this is Nemotron 3 in OpenClaw.
This is Nemotron 3 and OpenClaw and look at the top 3, they are the 3 best models in the world, okay? So we are at the frontier. It is also true that we want to create the foundation model so that all of you could fine-tune it and post-train it into exactly the intelligence you need.
This is Nemotron 3 Ultra. It is going to be the best base model the world has ever created. This allows us to help every country build their sovereign AI, and we're working with so many different companies out there. And one of the most exciting things that we're doing today, I'm announcing today, is a Nemotron coalition.
We are so dedicated to this. We have invested billions of dollars of AI infrastructure so that we could develop the core engines for AI that's necessary for all the libraries of inference and so on, but also to create the AI models to activate every single industry in the world.
Large language models are really important. Of course, it's important. How could human intelligence not be? However, in different industries around the world, in different countries around the world, you need to have the ability to customize your own models and the domain of the models is radically different from biology to physics, to self-driving cars, to general robotics to, of course, human language. And we have the ability to work with every single region to create their domain-specific, their sovereign AI.
Today, we're announcing a coalition to partner with us to make Nemotron 4 even more amazing. And that coalition has some amazing companies in it. Black Forest Labs, imaging company; Cursor, the famous coding company, we use lots of it; LangChain, billion downloads for creating custom agents; Mistral, Arthur Mensch, I think he's here, incredible, incredible company. Perplexity, Perplexity computer, absolutely use it. Everybody use it. It is so good, a multimodal agentic system; Reflection; Sarvam from India; Thinking Machines, Mira Murati's lab, incredible companies joining us. Thank you.
I said that every single enterprise company, every single software company in the world needs an agentic system strategy, needs an agent strategy. You need to have an OpenClaw strategy, and they all agree. And they're all partnering with us to integrate NeMo, the NemoClaw reference design, the NVIDIA agentic AI toolkit and of course, all of our open models. One company after another, there are so many, and we're partnering with all of you.
I'm really grateful for that. And this is our moment. This is a reinvention. This is a renaissance, a renaissance of the enterprise IT from what would be a $2 trillion industry, this is going to become a multitrillion dollar industry, offering not just tools for people to use, but agents that are specialized in very special domains that you're expert in that we could rent. I could totally imagine in the future, every single engineer in our company will need an annual token budget.
They're going to make a few hundred thousand dollars a year of their base pay. I'm going to give them probably half of that on top of it as tokens so that they could be amplified 10x. Of course, we would. It is now one of the recruiting tools in Silicon Valley. How many tokens come along with my job?
And the reason for that is very clear because every engineer that has access to tokens will be more productive. And those tokens, as you know, will be produced by AI factories that all of you and us, we partner to build, okay? So every single enterprise company today sits on top of file systems and data centers.
Every single software company of the future will be agentic and there will be token manufacturers. There'll be token users for their engineers, and there'll be token manufacturers for all of their customers.
The OpenClaw -- the OpenClaw event cannot be overstated. This is as big of a deal as HTML. This is as big of a deal as Linux. We have now a world-class open agentic framework that all of us could use to build our OpenClaw strategy. And we've created a reference design we call NemoClaw that all of you could use that is optimized, it's performant, it is safe and secure.
Speaking of agents, agents, as you know, perceive, reason and act. Most of the agents in the world today that I've spoken about are digital agents. They act in the digital world. They reason, they write software. It's all digital. But we also have been working on physically embodied agents for a long time.
We call them robots. And the AIs that they need are physical AIs. We have some big announcements here. I'm going to just walk through a few of them. 110 robots here. Almost every single company in the world that is building robots, I can't think of one that isn't, is working with NVIDIA.
We have 3 computers, the training computer, the synthetic data generation and simulation computer and of course, the robotics computer that sits inside the robot itself. We have all the software stacks necessary to do so, the AI models to help you. And all of this is integrated into ecosystems around the world and all of our partners from Siemens to Cadence, incredible partners everywhere. And today, we're announcing a whole bunch of new partners.
As you know, we've been working on self-driving cars for a long time. The ChatGPT moment of self-driving cars has arrived. We now know we could successfully autonomously drive cars. And today, we are announcing 4 new partners for NVIDIA's robotaxi-ready platform. BYD, Hyundai, Nissan, Geely, altogether, 18 million cars built each year, joining our partners from before, Mercedes, Toyota, GM, the number of robotaxi-ready cars in the future are going to be incredible.
And we're announcing also a big partnership with Uber. Multiple cities, we're going to be deploying and connecting these robotaxi-ready vehicles into their network. And so a whole bunch of new cars. We have ABB, Universal Robots, KUKA, so many robotics companies here, and we're working with them to implement our physical AI models, integrated into simulation systems so that we could deploy these robots into manufacturing lines all over.
We have Caterpillar here. We even have T-Mobile here. And the reason for that is in the future, what used to be a radio tower is going to be an NVIDIA Aerial AI-RAN. And so this is going to be a robotics radio tower, meaning it can reason about the traffic, figure out how to adjust its beamforming so that it could save as much energy as possible and increase the amount of fidelity as much as possible. There are so many humanoid robots here, but one of my favorites is a Disney robot. You know what? Tell you what, let me just show you some of the videos. Let's look at that first.
[Presentation]
Ladies and gentlemen, Olaf. So Newton works -- Omniverse works. Olaf, how are you?
I'm so happy now that I'm meeting you.
I know because I gave you your computer, Jetson.
What's that?
Well, it's in your tummy.
That's going to [indiscernible]
And you learn how to walk inside Omniverse [indiscernible] and it was because of physics using this Newton solver that runs on top of NVIDIA Warp that we jointly developed with Disney and with DeepMind that made it possible for you to be able to adapt to the physical world. [indiscernible] that's how smart you are.
I'm a snowman, not a snow [indiscernible].
Could you imagine this, the future of Disneyland, all these robots, all these characters wandering around. I have to admit though, I thought you were going to be taller. I've never seen such a short snowman, to be honest. Hey, tell you what, you want to help me out? Okay.
Usually, I close the keynote by telling you what I told you. We talked about inference inflection. We talked about the AI factory. We talked about the OpenClaw agent revolution that's happening.
And of course, we talked about physical AI and robotics. But tell you what, why don't we get some friends to help us close it out?
Of course.
NVIDIA — NVIDIA GTC AI Conference 2026
Overview
This transcript comes from an NVIDIA GTC keynote, not a classic earnings call. The focus is on platform architecture, CUDA-X, AI factories, the ecosystem, and concrete technical progress and roadmaps rather than quarterly figures.
Key Metrics
- Revenue/profit/margins/EPS: not stated in the transcript; no year-over-year or quarter-over-quarter comparisons are given.
- Key performance figures:
  - Tokens per second: a demonstrated increase from 700 to nearly 5,000 tokens/s (about 7x).
  - Throughput per watt (Grace Blackwell/NVLink 72): 35x better efficiency compared with reference data.
  - Inference breakthrough: an increase from 2 million to 700 million tokens/s is described, a 350x gain from integrating Vera Rubin, Groq and Dynamo.
  - Hyperscaler share of business: 60% of the business, with internal AI demand (e.g. REXUS, search) within the top 5 hyperscalers.
- Outlook: through 2027, a revenue projection of at least $1 trillion is stated for the inference phase.
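The multipliers quoted in the key performance figures can be checked with simple division. The snippet below is a minimal sketch that uses only the before/after values given in the list above; the dictionary keys and the `speedup` helper are illustrative names, not anything from NVIDIA.

```python
# Before/after throughput values as quoted in the summary (tokens per second),
# together with the multiplier the summary attaches to each.
claims = {
    "tokens_per_second_demo": (700, 5_000, 7),
    "inference_throughput": (2_000_000, 700_000_000, 350),
}


def speedup(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


if __name__ == "__main__":
    for name, (before, after, quoted) in claims.items():
        print(f"{name}: computed {speedup(before, after):.2f}x, quoted {quoted}x")
```

The first figure comes out slightly above 7x because the summary rounds "nearly 5,000" up to 5,000; the second matches the quoted 350x exactly.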
Strategic Direction
- Vertically integrated yet horizontally open platform strategy: hardware, software, libraries (CUDA-X) and ecosystem partners jointly build AI factories.
- CUDA-X as the core; cuDF (structured data) and cuVS (vector search) as central libraries for future applications.
- OpenClaw/NemoClaw as an open agentic framework with a focus on enterprise security; Nemotron coalition for domain-specific sovereign AI.
- AI factory concept: disaggregated inference (Dynamo), NVLink 72 architecture, Groq integration, Vera Rubin, Grace Blackwell; simultaneous scaling of memory, KV cache and tools.
- Partnerships with cloud providers (AWS, Google Cloud, Microsoft Azure) as well as OpenAI and Anthropic; confidential computing as the security standard.
- Omniverse/DSX platforms for digital twins, simulation and joint design of gigafactory-scale AI infrastructure.
Outlook & Guidance
Huang emphasizes a renewed inflection in inference: according to the outlook, demand for AI infrastructure could reach at least $1 trillion through 2027. Vera Rubin racks are expected to enter production soon (H2, probably Q3); Groq LP30/LPX developments expand capacity and performance. The supply chain is said to be able to produce thousands of systems per week; OpenClaw/NemoClaw form the basis for agentic IT strategies in enterprises. New partnerships, domain-specific models (Nemotron family) and a global coalition for sovereign AI are highlighted.
Analyst Questions
- The transcript contains no analyst Q&A session; no analyst questions are documented.
NVIDIA — Morgan Stanley Technology
1. Management Discussion
Wow, no music, no walk on music, no roaring applause. I'm just saying, I'm not used to coming to work in this way, this total silence. I'm just kidding.
2. Question Answer
There were a lot of Taylor Swift comments along the way. So crowd is ready.
This company needs humor to it. Is humor allowed here?
Humor is very allowed. I made investment banking jokes yesterday, Jensen. But, thank you for being here for the last, I think, 25, 27 years, you've been such a great supporter of this conference. I think we sometimes become numb to the scale of the numbers and the transformation we're experiencing. I don't think I'm the only one in this audience. I'm getting billions and trillions confused constantly.
My partner, Mark Edelstone and I, 27 years ago, we sat on a stage much smaller than this one on the Morgan Stanley trading floor, and we announced and introduced NVIDIA and you to the Morgan Stanley sales force. And believe it or not, $48 million IPO, 1998 revenue -- trailing revenue, $30 million. Jensen and his team, Colette, were so generous. Two years ago, you hosted our Board meeting in your headquarters. I think you had just announced a $30 billion quarter in terms of revenue. And then last week, a $46 billion net income quarter.
So we've moved from years to quarters from millions to billions. It's really amazing and unprecedented scale and growth. And then you've changed our lives. You've changed our lives.
And so I guess my question after that is what had to come together strategically, culturally, technically to deliver that type of hyper growth at scale, and the scale is really astounding? And again, thank you.
That's going to take 37 minutes and 13 seconds and slightly more. Obviously, NVIDIA wasn't built overnight. It's taken us 33 years. I sort of remember somehow that when we went public, our price was $13, and I just read yours, $12. I overstated it. I remembered it to be much more optimistic than it was actually. The company's valuation at the time, I think, was like $300 million. And Mark did such a good job, Mark Edelstone did such a good job preparing all of our investors that they really only had 1 question. It was literally a one-question IPO roadshow.
And the question was, when are you going out of business? I'm not kidding. And that question is about as hard to answer as the one you just gave me.
Well, the answer is, as it turns out, we started the company with the idea of creating a new computing platform, a new way of doing computing. And not that the old way was wrong, it's just that the new way, a new way is essential to solve some unique problems. And the type of things that we were extremely good at are algorithms, because the inner loop of the software tends to be about 5% of the code, but 99% of the compute time. And back then, the algorithms in the world of computers were quite rare. And one of the most important algorithms was computer graphics, the simulation of light and how light travels through space.
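The remark that the inner loop is a small share of the code but about 99% of the compute time is the classic setting for Amdahl's law, which bounds the overall gain from accelerating only part of a program. The sketch below is an illustration added here, not something stated in the conversation; the 99% figure comes from the passage, while the kernel speedups of 10x, 100x and 1,000x are assumed values chosen for the example.

```python
def amdahl_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when `accelerated_fraction` of runtime is sped up by `kernel_speedup`.

    Amdahl's law: 1 / ((1 - p) + p / s), with p the accelerated share of
    runtime and s the speedup applied to that share.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    serial_fraction = 1.0 - accelerated_fraction
    return 1.0 / (serial_fraction + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    # 0.99 is the share of compute time quoted in the passage;
    # the kernel speedups are assumptions for illustration only.
    for s in (10, 100, 1000):
        print(f"kernel {s}x -> overall {amdahl_speedup(0.99, s):.1f}x")
```

With 99% of the runtime in the accelerated loop, the overall gain can approach but never reach 100x, because the remaining 1% still runs at the original speed. That is why concentrating effort on the inner loop pays off so heavily.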
And so while computer graphics was used for things like animation movies, of course, at the time that we were founded, the cover of, I forget which magazine was, Jurassic Park was there. And so it was really -- it was during that time where computer graphics was becoming more capable, and we could simulate virtual reality with it, and we applied it to creating a new industry, which did not exist at the time called video games. And so 3D graphics was modernized in my time, consumerized in my time. And the whole video game industry was created in my time.
And when I say in my time, meaning it was NVIDIA that pulled it all together. The reason why we're so beloved in the video game industry and we're so deep in it still is, in a lot of ways, we created the modern video game industry. From the algorithms associated with it, the libraries. In the computer graphics industry without RTX, there would be nothing today. Without our contribution of all the algorithms that goes into all of the game engines, you wouldn't be able to enjoy the type of video games you enjoy today. So NVIDIA has been deep in the world of algorithms since day 1, 33 years ago.
Now accelerated computing requires what is described as a full stack, meaning the architecture, the chip design, the libraries that sit on top of it, how it's integrated forwardly, I'm using that. Apparently, there's a new idea called forward deployed engineers or something like that. NVIDIA's had DevTech engineers 33 years ago. We deployed them into the world's video game industries and video game companies and game engines, and we integrate our technology into their game engine. Today, if you look at Epic's Unreal Engine, NVIDIA's technology is all over it. And you go into every game developer, NVIDIA's technology is all over. That's the reason why all the games run best on NVIDIA for good reason. That's the reason why NVIDIA is the world's largest game platform.
You probably don't know this, but there are several hundred million active GeForce gamers in the world. Many of them turned into AI researchers. It's because of GeForce GTX 580 that Ilya Sutskever and Alex Krizhevsky and Geoff Hinton, it was Geoff that told them to go buy it, discovered CUDA. And so the first idea about NVIDIA is that we're a full stack company.
The second idea about our company, and this is really old history that many people might not have been born yet. But during that time, the PC architecture was incompatible with today's computer graphics capabilities. And we created some new technology called Direct NVIDIA. It was a way for applications to directly communicate with our APIs. And we exposed it to some very important companies, it became DirectX. If you look at the way that we communicate between us and the application, that was completely revolutionary to bypass a whole bunch of software that makes it slow, right, to make accelerated computing possible.
We introduced the idea of virtualized frame buffer memory into system memory. It was initially called AGP, which then became PCI Express. Many of the system architecture had to be reinvented so that we could accommodate video games and 3D graphics in a PC. Well, that same sensibility, of both innovating the full stack to be integrated in algorithms as well as changing the architecture of systems so that we could create new computer systems, that same sensibility and expertise led to DGX-1, which was the world's first AI supercomputer, that I delivered by hand to San Francisco, here very close by, to a company that eventually became OpenAI.
And so the fundamental attitude, if you will, expertise, how we see the world propagated in this way. It's literally 33 years. The company's entire culture is designed to be full stack. The organization is designed to be full stack. The entire system is designed to create new stacks and new system architectures that allow us to do this. Well, we started with, of course, the -- if you look at NVIDIA's graphics cards, GeForce, it's a technology marvel. How it's integrated into the operating system, how it's integrated into the system architecture, completely reinvented how computers worked before.
Well, we have no trouble with that with DGX-1. I have no trouble with that with the first supercomputing cluster, which then went to Satya for their first supercomputer. And you might -- people have noticed that Microsoft's first supercomputer and NVIDIA's supercomputer had exactly the same benchmark, like down to the -- you measure the performance of the system across all of these GPUs, that was about 10,000 GPUs or so. It was exactly the same performance. And the reason for that is because we designed it, and we delivered to Azure Cloud.
It was all based on InfiniBand. It was all based on Ampere, this is the A100, which became the first computer that OpenAI used. And so we're quite comfortable with this full stack, full system approach. And without being able to do that, it is impossible to stay at the bleeding edge. It is literally impossible to keep up with a company that's building not just one chip each year, but we're building an entire infrastructure each year because we own the CPU. We revolutionized the new way of designing CPUs, and you'll see more examples of that. We revolutionized the way we do CPUs, revolutionized the way we obviously do GPUs, connect them together using this thing called NVLink, which revolutionized the way you build computers altogether, connected together with a new type of AI Ethernet called Spectrum-X, we connected everything together. Now we own the entire stack. We know all the chips inside.
When you own the entire stack and you own all the chips inside, you could change it every single year. If you don't own the entire stack and you don't own all the chips, it's hard to innovate every year. And the reason for that is because you're connecting too many cats and dogs, and there's too much innovation to pull together once a year if you can't control it because it's a full stack problem. So that's how we got here.
It's amazing. In the last 2 years, since you were here last and our Board meeting, we've sort of gone from generative AI models to reasoning and now agentic and Satya just finished a panel on the enterprise. And at the enterprise level, we're working with Microsoft, OpenAI, xAI, Gemini, we have Dario here from Anthropic, the capabilities are extraordinary. What does it mean around the size of that enterprise market? How is it changing? And how is it going to be adopted? And how do you sort of see that playing out over the years because it's a big, big topic of the...
Yes, really good. Literally, in the last 2 years, we went through 3 inflection points in AI. The first inflection point, first of all, the technology sat there in plain sight for months. GPT-3 sat there in plain sight for months until somebody wrote essentially a wrapper around it and turned it into ChatGPT, turned it into an API, made it available and easy to use by everybody. But the first inflection point was generative, as you mentioned. The ability to translate -- convert information from one form to another form and autoregressively generate tokens.
And the second -- but of course, the problem with generative AI is that -- that it's prone to hallucinate. And the reason for that is because -- not because there's something fundamentally wrong with the technology, not because it didn't -- it didn't learn all the right things, but because it's not grounded on contextual information. It's not grounded on relevant information. And so the second thing that happened was o1 and reasoning came about. But behind o1 is also grounding on research, grounding on truth, and the ability to have -- to combine generative with semantic, we call it retrieval-augmented generation, but basically conditional generation, okay? Conditional generation, meaning that what you're about to generate depends on context and ground truth or whatever research or whatever it is.
And so the second generation introduced reasoning, self-reflection, the ability to self-correct because sometimes what comes out of your mouth, you kind of wish you could pull it back and you go -- and so in the case of AI, it has the ability to do that in real time. And so o1 became much more grounded and the information that was generated was more reliable.
So what happened? What came out as a curiosity and the incredible excitement in the tech industry, jumping on to it because we realize what's about -- what can happen. The next phase of it, the usefulness of ChatGPT just skyrocketed. But the amount of tokens that it generated was much, much more than the first generation, maybe 100x more tokens. The model is maybe 10x larger. So it's probably something like 1,000x more compute. So from o1 over ChatGPT, call it 1,000x. And then because it was so useful, maybe 1 million times more usage, okay? So the combination of usage and its usefulness and groundedness allows us to -- we saw that next phase of growth.
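The back-of-the-envelope estimate in the preceding paragraph multiplies two factors: roughly 100x more tokens per response and a model roughly 10x larger. The sketch below reproduces that arithmetic. It relies on the common rule of thumb that inference compute scales with model size times tokens generated, and the final step of multiplying by the usage factor is an assumption made here to show the aggregate effect; the speaker does not state that product explicitly.

```python
# Rough multipliers quoted in the paragraph above (all are "maybe" figures).
TOKENS_FACTOR = 100        # more tokens generated per response
MODEL_SIZE_FACTOR = 10     # larger model
USAGE_FACTOR = 1_000_000   # more usage


def compute_per_response_factor(tokens_factor: int, model_size_factor: int) -> int:
    """Compute per response, assuming compute ~ model size x tokens generated."""
    return tokens_factor * model_size_factor


def aggregate_compute_factor(per_response: int, usage_factor: int) -> int:
    """Total compute demand, assuming per-response compute and usage simply multiply."""
    return per_response * usage_factor


if __name__ == "__main__":
    per_response = compute_per_response_factor(TOKENS_FACTOR, MODEL_SIZE_FACTOR)
    total = aggregate_compute_factor(per_response, USAGE_FACTOR)
    print(f"compute per response: {per_response:,}x")
    print(f"aggregate compute (assumed product): {total:,}x")
```

The per-response figure matches the "call it 1,000x" in the conversation. The aggregate figure is only as good as the assumption that the two effects compound independently.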
But in the end, what o1 did was it provided information, essentially a chatbot that was much more factual. It was informational. And of course, for many of us, we use it for research, and we use it all the time. Instead of searching, our goal isn't to search, our goal is to get answers. And so ChatGPT gave us that. That was kind of the second inflection.
The inflection that we're seeing here also sat in plain sight for quite a long time. And it's basically the ability for AI to use files, access files and use tools. And so now it could reason, it could think, it could use tools, it could solve problems. And it could do search, it could do planning. And so probably the biggest phenomenon that's happening, and if you're paying attention to it, I'm sure you are, OpenClaw is probably the single most important release of software probably ever. And if you look at OpenClaw and the adoption of it, Linux took, right, some 30 years to reach this level. OpenClaw in, what is it, 3 weeks has now surpassed Linux. It is now the single most downloaded open source software in history, and it took 3 weeks.
If you look at the line, even in semilog, this thing is straight up. It's vertical. It looks like the Y-axis. I've never seen anything like it, okay? It really looks like a Y-axis. And so what's happening now? You could give a problem statement, create, start with the prompt goes, create. The last prompt -- the way you kind of think about it, the last prompt was what is, when is, who is, right? That's the last prompt. This now prompt goes create, do, build, write, does that make sense? So what's happened? The last prompt was queries. This prompt are actions. They're tasks, do something for me. And you describe it as expressively as you like with a lot of intention and let it infer or very specific, and it goes off and it just churns, it just thinks. It goes off and it does research and it reads, it reads a manual. If it has to use a tool it has never used before, it reads the manual of the tool. It goes off and studies what's on the web and it applies the tools and performs the task.
Now I just said we went from one generative prompt -- one generative response to now one that is 1,000x more tokens and agents, we call them at the company claws. These claws are now consuming what, 1 million times more tokens. They're running continuously in the background. We have a whole bunch of claws in the company. And they're all continuously running, doing things for us, writing -- developing tools, developing software. And so now the question is the implication. The amount of compute in our company that we need has just skyrocketed. The amount of compute every company needs is skyrocketing.
So in that context, I think over the last few days, it's come out certainly at Morgan Stanley as a user, maximum bullish on tokens, maximum bullish on doing and creating. It does require the compute you just mentioned. And the question is around the financing and the CapEx around that to support that extraordinary large compute. How does it all get financed as you see it from a sort of top of the ecosystem? And how do the factory -- AI factory economics play out and evolve going forward?
And so there's a couple of thoughts that's really important. Remember, I appreciate you using the word factory. Several years ago, I described that these new -- these data centers, what people call data centers, is not for storing data as in a data center. They are producing tokens. And so a facility, a plant with the fundamental purpose of producing tokens is a factory. It's an AI factory. And at that time people said, Jensen that sounds so grungy. It's clean. And -- but it produces tokens. And nobody likes to build data centers because who knows what kind of return you're going to get on a data center, but everybody loves building factories. And the reason for that is because factories make money.
And we now know for certain that these factories directly generate tokens, and these tokens are monetizable. And the more compute you have, the more tokens you can produce, the more tokens you produce, the greater your top line. We now know for certain -- we now know for certain that company's revenues are directly correlated to compute. And we know that for a fact because if Anthropic had 3x more compute, their revenues will be 3x higher. We know that. We know that Anthropic is compute limited, factory limited. It's no different than Mercedes being factory limited or any company being factory limited. And so if they had more compute in their factories, they will have higher revenues.
If OpenAI had -- right now had more compute, they will have higher revenues. And so the first thought is that compute equals revenues. Now the big idea, of course, compute equals GDP, that we also know. Compute equals a country's GDP. And so that's one thought.
The second thought, the reason why NVIDIA is so successful is because we engineer these systems full stack end to end, and they're architected from the ground up to generate tokens at incredible effectiveness. NVIDIA's tokens per watt is an order of magnitude, an order of magnitude ahead of the competition, alternative.
Tokens per watt. Now what does that mean? Remember, your factory has 1 gigawatt. And if your tokens per watt is 10x the alternative, your revenues are 10x the alternative. For the very first time in history, the computer architecture chosen in a company's factory must go through CEO review, no question about it. That company only has a gigawatt or 2.3 gigawatts for next year. If they put the wrong system inside, it will affect their revenues the next year. I promise you that, and we see it. And so our architecture is so advanced now and pulling further and further ahead. Probably one of the most exhaustive benchmarking efforts was done by a firm called SemiAnalysis, and they declared NVIDIA Inference King. Inference is tokens per second, tokens per watt. It's about generating tokens, and tokens per dollar. When our performance per watt or per anything is so much ahead of the competition or the alternative, our tokens per dollar is also the best, which means we're the cheapest tokens you can produce today, not even close, an order of magnitude better.
And so that's the second thought. The second big idea for AI is AI is a factory because factories are power limited always. It doesn't matter how many plants you have. Each plant is still 100 megawatts or a gigawatt. And therefore, tokens per watt is the single most important thing for the top line of companies, and they have to make those decisions very, very carefully. It's no longer just about PowerPoint slides. You're not going to go put $50 billion down on somebody's PowerPoint slides.
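The tokens-per-watt argument above is simple arithmetic, and a small sketch can make it concrete. The Python snippet below is an illustrative toy model, not something presented in the talk: the function name, the baseline throughput of 0.5 tokens per second per watt, the price of $2 per million tokens, and the full-utilization default are all hypothetical placeholders. It only shows that, at fixed power, revenue scales linearly with tokens per watt.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_watts, tokens_per_sec_per_watt,
                         usd_per_million_tokens, utilization=1.0):
    """Revenue of a power-limited token factory (toy model, hypothetical inputs)."""
    tokens_per_year = (power_watts * tokens_per_sec_per_watt
                       * SECONDS_PER_YEAR * utilization)
    return tokens_per_year / 1_000_000 * usd_per_million_tokens


ONE_GIGAWATT = 1e9  # factory size used in the talk; everything else is assumed

# Same power budget and token price, two hypothetical architectures.
baseline = annual_token_revenue(ONE_GIGAWATT, 0.5, 2.0)
ten_x = annual_token_revenue(ONE_GIGAWATT, 5.0, 2.0)
revenue_ratio = ten_x / baseline

print(f"baseline revenue: ${baseline:,.0f} per year")
print(f"revenue ratio at equal power: {revenue_ratio:.1f}x")
```

Because power is the fixed input, the absolute price and throughput values cancel out of the ratio; only the relative tokens per watt matters, which is the point being made in the talk.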
So the token demand is extraordinary, as you just mentioned. We're seeing it in your numbers, right? I think I mentioned $46 billion in net income, but $70 billion in revenue...
You're going to ask me something about how to fund it. Can I just tell you how to fund it? First of all, I just told you the reason why you have to build these factories in the future is because you either -- you just believe that, one, software is important. And so I hope this audience believes software is important. Software runs the world. The first thought.
The second idea is this. There will be no software in the future that's not agentic. Do you guys agree with that? How could you have software that's dumb? And so it is absolutely true that every software company will become an agentic company. They're going to simultaneously use open models, okay? Open models mean the ones that they download themselves and they fine-tune themselves. They're also going to use closed models. The combination of all that, just like we -- in all of our companies, we have employees that we hire. We have employees that we're grooming. We have contractors that we bring in. We have specialists like yourself that we bring into the company to do our work. Our job is not to do the job. Our job is to have the job be done. That's what every company does.
And so therefore, every company will realize that these AI models, some of it you rent, some of it you build. That's not illogical, just like biological workers, you will do that with digital workers. And so every single software company in the future will no longer just rent tools, but they'll rent also experts to use the tools. They'll not just rent tools, but rent experts that use those tools because their agents are going to be extremely good at using their specialized tools.
And so every single software company -- what is the IT industry, a couple trillion dollars? Today, they're tool renters. In the future, they will, of course, have -- they'll rent agents that use those tools, which means that the software industry in the future will be much larger than the software industry of today. You pick your favorite software companies, and I can imagine a much, much larger future for them. Cadence is going to be much larger. Synopsys is going to be much larger. Siemens is going to be much larger in the future. But their business profile will change because today, they're basically a software licensing company. In the future, they will also rent tokens, specialized tokens, which also means that that $2 trillion industry, with no token consumption today, will be an extraordinary token consumer in the future. That's where that money is going to come from.
They're -- all of those software -- the IT industry of today, not the enterprise companies, the IT industry alone is going to shift an enormous -- it's going to consume enormous amounts of tokens in the clouds and they're either open models or [indiscernible].
So that extraordinary token economy is facing some constraints. So we've got memory constraints. We've got power permitting constraints. I was in Texas with builders. We have electrician constraints. How do you see that playing out? Satya raised it in the last session, you're closer to it. And also, if it takes a little longer, is it still okay? Or is it really negative if we just -- if the cycle on building this extraordinary...
I love constraints. I love constraints. And the reason for that is because in a world of constraint, you have no choice but to choose the best. You can't squander your choice. If the data centers -- if the land power and shell is constrained, you're not going to randomly put something in there just to try it out. You're going to put something that you know for certain is going to deliver the tokens per watt, that you know for certain is going to allow you -- from the moment you secure the capacity, we're going to be able to stand up an entire factory for you. We're the only company in the world that can come into your company and help you stand up an entire AI factory.
So anybody here that needs an AI factory and you need -- I'm happy to help. You call one person and that one person comes in and the next thing you know, you're in the AI factory business, okay? And so we have the expertise. We know the architecture works. We know there's enormous demand for the architecture after you're done standing it up, so we can help you get into business. And so when you're constrained that way, you have no choice but to make the best choice because your revenues next year is directly correlated to it.
And this is one of those questions now for all the CEOs that are in the cloud -- they're cloud service providers or software providers, if they make poor choices, this is no different than me choosing the wrong foundry. This is no different than me choosing the wrong memory, the wrong anything because I have so little -- everything is so constrained. If I choose poorly, my revenues are affected, everything is affected. And so they can't choose poorly.
The second thing is NVIDIA is, as you mentioned, working at such a large scale. Our supply chain, one of the things that we do with our money, of course, is to secure our supply chain. One of the things that we do with our capital is to secure supply chain so that when Satya asked me to help them stand up a few gigawatts, the answer is no problem. And the reason for that is I got all the memories, I've got all the wafers, I got all the CoWoS. I've got all the packaging, I've got all the systems, I've got all the connectors, I got all the cables. Everything from copper to multilayer ceramic capacitors, everything is secured. That's one of the reasons why NVIDIA's balance sheet being strong is so strategic.
A strong balance sheet today is not only helpful, it's strategic. And so you look at the amount of revenues we're shipping into, just look backwards and look at the amount of supply chain capacity we had to go secure. For that, they have to believe. If you set up a factory, a plant -- a DRAM plant, and I come in and say, you know what, go ahead and set up the DRAM plant because I'm going to use it. That goes a long ways. You might as well take that to the bank, as many of them have. And so I think the fact that everything is scarce is fantastic for us.
I think it does create duration, which I think is extraordinarily powerful for you. I think just another layer, which is the ecosystem, you're one of the great -- you are the greatest cash flow generating company in history. And then you've taken that capital and really created it, feels like stability, diversity in the ecosystem. And so how do you think about that in both a financial and a strategic context as you build, I think, both duration and durability in the entire ecosystem?
Yes. When Mark took me public, I think it was probably a little bit less energetic than I was delivering it just now. But I am fairly certain I said all the same things. NVIDIA has been building -- remember, accelerated computing requires that I build an ecosystem. You can't just take code and decompile it and it works. There's no such thing as a universal accelerated computing system. Accelerated computing is by definition proprietary. There is nothing about our architecture that is compatible with somebody else's. It's just not. The instruction set is different, the architecture is different, the micro-architecture is different, everything is different. And so we hide it underneath these things in such a way that makes you feel like and because of NVIDIA, we accelerate everything from data processing, molecular dynamics, fluid dynamics, particle systems, biology, chemicals, all the way to deep learning, right? Robotics, long sequence, spatial, 3D, you name it, right?
Sounds like a 5-layer cake?
It's a 5-layer cake, right, exactly. But because we've been working on it so long, it looks like everything is accelerated. But it's not true. It's because I did it one at a time, one domain at a time that all of the important domains in the world are now fully accelerated. And so the thing that we do on the supply chain side, our balance sheet is incredibly valuable because it provides security for our customers. On the upstream side, I'm cultivating new ecosystems for the future. All these AI natives that I'm investing in, the companies that we're partnering with, these are expanding, extending the CUDA ecosystem. 100% of everything that we do is on top of CUDA. Every investment that we've made is on top of CUDA.
So recently, there was a question about, are we going to invest $100 billion in OpenAI. We just -- just for everybody's update, we finalized our agreement. We're going to invest $30 billion in OpenAI. I think the opportunity to invest $100 billion in OpenAI is probably not in the cards. And the reason for that is because they're going to go public. And so I'm fairly sure that if we provide the capacity they need, which the compute capacity they need, which we're ramping up hard to go to, the revenues will more than follow. And they're going to go public towards the end of the year. And so this might be the last time we'll have the opportunity to invest in a consequential company like this. And our $10 billion investment in Anthropic probably will be the last as well.
And speaking of that, one of the things that I wanted to make sure I told you guys this time is something new that you probably haven't internalized. You see all the news, but you probably haven't internalized some of the really great work that we did last year, the last 1.5 years or so, last year or so. We expanded OpenAI's capacity from Azure to OCI to now AWS. We expanded OpenAI's reach of capacity to AWS. We're ramping AWS like mad. We're ramping them as hard as we can so that OpenAI has access to even more capacity. That's one.
The second thing that we did, and this was a really, really great outcome is we're now also working with Anthropic. And in the case of Anthropic, we're expanding their capacity as aggressively as we can at AWS as well as Azure. And so notice what we're doing in both -- they used to be one and one, now they're kind of cross product. But the amount of capacity that we're going to bring online for them, supporting their revenues, their quality of revenues are so good, we just need a lot more capacity for them. So I think that this is something that -- that is somewhat new.
And of course, the third thing that happened is a brand-new AI lab flashed into the world. Isn't that right? I don't -- nobody has mentioned them. A brand-new lab came into the world and they're going to need a few million GPUs, and that's MSL. And so that MSL is a net new on top of Meta. So we've worked with Meta a long time. MSL is a net new on top of Meta. And so these 3 things happened, 3 new growth vectors. OpenAI at AWS; Anthropic, totally at both AWS and Azure; and MSL. And so our demand profile went from being incredibly high to higher than that.
Speaking of -- so speaking of more than that. There's Waymos everywhere. I want to walk my new dog with my new robot. Physical AI could be the next place. How does that take TAM and tokens to a whole another level for NVIDIA?
That's really great. AI is all the stuff that we're doing inside the building. But obviously, ultimately, the largest industries are outside the building. And that AI needs to be -- needs to have physical awareness, physical understanding. Causality, you push a bottle, it falls over and understands gravity, understands collision, understands inertia, understand those things, okay? And understand, for example, object permanence. I take this and I put it behind my chair. In your mind, you can't see it, but you realize it hasn't disappeared, okay? So object permanence. Things like that, that affects physical behavior and physical intelligence fairly importantly.
And so -- so you probably also don't know this, that NVIDIA is the frontier of physical AI. Cosmos is the most downloaded physical AI model in the world. NVIDIA is also the frontier of autonomous AI, 2 versions, autonomous vehicle called Alpamayo, look it up, #1 downloaded. And then the next one GR00T, humanoid robotics physical AI. We are at the frontier on all 3 of those. We're also at the frontier of digital biology AI, look up La-Proteina, incredibly successful. La-Proteina for digital biology, there's a whole bunch of other models. GR00T N2 is now #1 most downloaded humanoid robotics model in the world.
And so we are at the frontier on physical AI, physics, laws of physics, multiphysics, Earth-2. We're at the frontier of physical AI. That is physical AI and AI physics. And so this whole area of physical AI, NVIDIA defines the frontier. It is completely open. We open it because we want to enable every company, new or old industry to be able to take advantage of this capability. And we've got the whole stack and the necessary computers for you to advance the AI for your own use as well as deploy it inside a robot, inside a plant, at the edge, at a radio tower, deploy it everywhere. This is the next frontier.
In 2 years' time, we're going to be largely done talking about agentic AI because we're all going to be using it. In 2 years' time, and if you invite me back again...
Every year, every year.
We're going to be talking about all these new companies. Of course, we announced a very important one, a co-innovation lab with Lilly. There will be others. But in order to set up Lilly's AI factory, unless you have the capabilities of NVIDIA and this full software stack and the capabilities of all the models and the expertise in that digital biology domain, how would you even do it? And so the things that we are building in the next couple of years, you'll see really come to the fore. And we're going to be talking about physical AI starting in the next 2, 3 years and for a decade.
So the speed of innovation and the pace that you're operating in is truly extraordinary. So at the beginning of the week, my partner, Joe Moore made NVIDIA his #1 pick.
Is that right?
It's his #1 pick.
Thank you.
Thank you. Good timing, Joe.
33 years later.
How do you think about the stock? Do you think about the stock? Do you have perspectives on it? You're so extraordinarily important and busy around driving all this innovation for -- in essence, everything that's going on with 3,500 attendees, and we had $40 trillion of market cap here. How do you think about that?
Well, of course, I care about the stock. I care about shareholders. I care about our employees, I care about all of you. And you might be referring to -- we just had the best earnings of earnings in the history of earnings. Is that what you're saying? I think the -- I think somebody actually told me that this might be the single best print in the history of humanity. And I said it must be only recorded humanity. I'm sure somebody had better returns. But anyways, we had a very good quarter.
Listen, you can't hold the stock back. You can't hold it back. And the reason for that is very simple. Compute equals revenues for companies. In the future, every single company will need compute for revenues. I'll just make that prediction for now. Every single company will need compute for revenues. And the reason for that is because compute translates to intelligence, which translates to your digital workforce, which translates to your revenues. I am certain compute equals revenues. I'm certain also that compute equals GDP. Therefore, every country will have it because not one country in the future will say, guess what, we're going to opt out on intelligence. We've got -- I don't know what we got, but we don't need intelligence. That's the one thing we don't need, okay? And so if you need intelligence, you're going to need digital -- you're going to need AI, you're going to need compute. And so compute equals GDP. I know that for certain.
I also know that we're at the beginning of this journey. And I see crystal clearly exactly how it's going to get funded. We know for a fact that all the CSPs took all of their CapEx and they converted to generative agentic systems, AI systems because it helps search, because it helps shopping, because it helps ads, because it helps social, because it helps literally every single Internet service in the world has been reinvented into generative AI. So they could take a 100% -- the entire Internet industry could take 100% of their CapEx and make it AI because it's better. We've proven it to be better. Meta has proven to be better. Google has proven to be better. AWS has proven to be better. And so you can now take your CapEx and convert to this.
Number two, I just said the entire software industry will be token driven. The entire software industry, you pick your favorite software company, and I can show you exactly how they're going to be token driven. And that token -- that token -- you take your favorite software company, their token will be either produced by themselves, which needs compute or they could be resold from Anthropic or OpenAI and that needs compute. And so what that says for the first time is the entire IT industry will have to be fueled by compute. That's exactly where all this is going to come from, trillions of dollars of it, and we're at the beginning of that. So that's my prediction.
Thank you, Jensen, for making history at this conference, 27 years. Thank you.
NVIDIA — Morgan Stanley Technology
Overview
In this conference session, NVIDIA discussed a rapidly growing AI compute environment and stressed that the company operates as a full-stack provider, from CPU/GPU to networking, built around the idea of AI factories that produce tokens. The figures cited point to a very large quarter: revenue of about USD 70 billion and net income of about USD 46 billion; the CEO described the result as one of the best in history.
Key Figures
- Revenue: USD 70 billion; change versus prior year/prior quarter not stated.
- Net income: USD 46 billion; change versus prior year/prior quarter not stated.
- Margins, EPS, etc.: not given in the transcript.
- Note: Throughout the session, revenues are described as directly tied to compute.
Strategic Direction
- Full-stack strategy: NVIDIA stresses that it owns the entire system stack (architecture, chips, libraries, NVLink, Spectrum-X) and can therefore deliver new stack and system architectures every year.
- AI factories: Compute efficiency (tokens per watt) and the ability to build AI factories are at the center of the growth strategy; companies are said to generate revenues directly from compute.
- Growth areas: OpenAI, Anthropic and MSL as new growth vectors; OpenAI capacity is being expanded from Azure and OCI to AWS; Anthropic capacity is being expanded at both AWS and Azure; MSL is net-new demand on top of Meta.
- Ecosystem and CUDA: The aim is a broad CUDA ecosystem; every investment is made on top of CUDA; the OpenAI and Anthropic initiatives strengthen the partner ecosystem.
Outlook & Guidance
No specific numerical guidance was given. The focus is on further scaling of AI infrastructure ("factories") and on rising token demand. Risks and opportunities discussed: availability of memory, power and grid capacity, and the supply chain; NVIDIA stresses the importance of a strong balance sheet for securing the supply chain. Management emphasizes that the link from compute to revenues (and thus to GDP) is becoming increasingly fundamental.
Analyst Questions
- Question: How will the CapEx for AI factories be financed? What role do tokens, compute efficiency and capital structure play?
Answer: Jensen explains the "factory" model: data centers produce tokens, and revenues are directly correlated with compute. He stresses that tokens per watt is the decisive dimension, since more tokens per watt means more revenue per plant. NVIDIA can help customers stand up entire AI factories, and its strong balance sheet helps secure the supply chain.
- Question: How are capacity and financing for OpenAI, Anthropic and MSL developing? What concrete steps is NVIDIA planning?
Answer: OpenAI capacity is being expanded from Azure and OCI to AWS; Anthropic is being ramped aggressively at AWS and Azure; MSL is net new on top of Meta. Overall, Jensen highlights three new growth vectors and a marked increase in demand for capacity.
- Question: What role does physical AI play in the future TAM? How does NVIDIA see this developing?
Answer: Jensen positions NVIDIA at the frontier of physical AI (Cosmos, Alpamayo, GR00T N2, La-Proteina) and sees significant token demand beyond classic software settings. He also describes how physical AI in robotics, edge environments and industrial applications will drive broader token markets.
NVIDIA — Q4 2026 Earnings Call
1. Management Discussion
Good afternoon. My name is Sarah, and I will be your conference operator today. At this time, I would like to welcome everyone to NVIDIA's Fourth Quarter Earnings Call. [Operator Instructions]. Toshiya Hari, you may begin your conference.
2. Question Answer
Thank you. Good afternoon, everyone, and welcome to NVIDIA's conference call for the fourth quarter of fiscal 2026. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer, and Colette Kress, Executive Vice President and Chief Financial Officer. Our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the first quarter of fiscal 2027. The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without prior written consent.
During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties, and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent Forms 10-K and 10-Q and the reports that we may file on Form 8-K with the Securities and Exchange Commission. All our statements are made as of today, February 25, 2026, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements.
During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website. With that, let me turn the call over to Colette.
Thanks, Toshiya. We delivered another outstanding quarter with record revenue, operating income and free cash flow. Total revenue of $68 billion was up 73% year-over-year, accelerating from Q3. Growth on a sequential basis was also a record as we added $11 billion in Data Center revenue across a diverse and expanding set of customers, including cloud providers, hyperscalers, AI model makers, enterprises and sovereign nations. Demand for our Blackwell architecture, extreme co-design at data center scale continues to strengthen as inference deployments grow in addition to training. The transition to accelerated computing and the infusion of AI across existing hyperscale workloads continue to fuel our growth.
Agentic and physical AI applications built on increasingly smarter and multimodal models are beginning to drive our financial performance. On a full year basis, Data Center generated revenue of $194 billion, up 68% year-over-year. We have now scaled our Data Center business by nearly 13x since the emergence of ChatGPT in fiscal 2023. As we look ahead, we expect sequential revenue growth throughout calendar 2026, exceeding what was included in the $500 billion Blackwell and Rubin revenue opportunity we shared last year. We believe we have inventory and supply commitments in place to address future demand, including shipments extending into calendar 2027. Every data center is power-constrained. Customers make critical architectural decisions based on performance per watt given these constraints and the need to maximize AI factory revenue. SemiAnalysis declared NVIDIA the Inference King as recent results from Inference-X reinforced our inference leadership, with GB300-NVL72 achieving up to 50x performance per watt and 35x lower cost per token compared with [indiscernible], and continuous optimization of CUDA software helped deliver up to 5x better performance on GB200/NVL72 within just 4 months. NVIDIA produces the lowest cost per token, and data centers running on NVIDIA generate the highest revenues.
Our pace of innovation, particularly at our scale, is unmatched. Fueled by an annual R&D budget approaching $20 billion and our ability to extreme co-design across compute and networking, across chips, systems, algorithms and software, we intend to deliver X-factor leaps in performance per watt every generation and extend our leadership position over the long term. Q4 Data Center revenue of $62 billion increased 75% year-over-year and 22% sequentially, driven primarily by sustained strength in Blackwell and the Blackwell Ultra ramp. With NVIDIA infrastructure in high demand, even Hopper and much of the 6-year-old Ampere-based products are sold out in the cloud.
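As a rough cross-check of the growth figures quoted above, the prior-period values can be backed out from the stated amounts and percentages. The Python sketch below is only an illustration: the helper name `implied_prior` is hypothetical, and because the inputs are the rounded figures as spoken on the call, the outputs are approximations and not reported numbers.

```python
def implied_prior(current, growth_pct):
    """Back out the prior-period value from a current value and its percent growth."""
    return current / (1 + growth_pct / 100)


# Figures as stated on the call, in billions of USD (rounded by the speaker).
q4_total_revenue = 68.0   # up 73% year-over-year
q4_data_center = 62.0     # up 75% year-over-year, 22% sequentially

prior_year_total = implied_prior(q4_total_revenue, 73.0)
prior_year_dc = implied_prior(q4_data_center, 75.0)
prior_quarter_dc = implied_prior(q4_data_center, 22.0)
sequential_dc_add = q4_data_center - prior_quarter_dc

print(f"implied year-ago total revenue:  ~${prior_year_total:.1f}B")
print(f"implied year-ago Data Center:    ~${prior_year_dc:.1f}B")
print(f"implied prior-quarter Data Center: ~${prior_quarter_dc:.1f}B")
print(f"implied sequential Data Center add: ~${sequential_dc_add:.1f}B")
```

The implied sequential addition rounds to about $11 billion, which is consistent with the sequential Data Center increase mentioned in the prepared remarks.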
Nearly a year has passed since the release of our Grace Blackwell NVLink 72 systems.
Today, nearly 9 gigawatts of infrastructure on Blackwell are deployed and consumed by the major cloud service providers, hyperscalers, AI model makers and enterprises. Networking, a cornerstone of our data center scale infrastructure offering, was a standout this quarter, generating $11 billion in revenue, up more than 3.5x year-over-year. Demand for our scale-up and scale-out technologies reached record levels, both growing double digits sequentially, driven by strong adoption of NVLink, Spectrum-X Ethernet and InfiniBand. On a year-over-year basis, growth was driven primarily by NVLink 72 scale-up switches as Grace Blackwell systems accounted for roughly 2/3 of data center revenue in the quarter. NVLink scale-up fabric has revolutionized computing and demonstrates the power of extreme co-design across all of the chips of the supercomputer and the full stack.
In Q4, we announced that we will enable AWS with NVLink to integrate with their custom silicon. Momentum is strong with our Spectrum-X Ethernet scale up and scale across networking as customers work to unify distributed data centers into integrated gigascale AI factories. For the full year, our networking business exceeded $31 billion in revenue, up more than 10x compared to fiscal [ 202 ]5, the year we acquired Mellanox. Our demand profile is broad, diverse and expanding beyond just chatbots. First, there is a fundamental platform shift from classical machine learning to generative AI. Strong evidence of ROI as hyperscalers upgrade massive traditional workloads to generative AI, including search, ad generation and content recommender systems, is encouraging our largest customers to accelerate their capital spending. For example, at Meta, advancements in their GEM model drove a 3.5% increase in ad clicks on Facebook and more than 1% gain in conversions on Instagram, translating into meaningful revenue growth.
With the same NVIDIA infrastructure, Meta Superintelligence Labs can train and deploy their frontier agentic AI systems. Frontier agentic systems have reached an inflection point. Claude Code, Claude Cowork and OpenAI Codex have achieved useful intelligence. Adoption is skyrocketing and tokens are profitable, driving extreme urgency to scale up compute. Compute directly translates to intelligence and revenue growth. Analyst expectations for 2026 CapEx across the top 5 cloud providers and hyperscalers, who collectively account for a little over 50% of our data center revenue, are up nearly $120 billion since the start of the year and approaching $700 billion. We continue to expect the transition of classic data center workloads to GPU-accelerated computing and the use of AI to enhance today's hyperscale workloads to contribute toward roughly half of our long-term opportunity. Every country will build and operate some parts of its AI infrastructure, just like with electricity and Internet today. In fiscal year 2026, our sovereign AI business more than tripled year-over-year to over $30 billion, driven primarily by customers based in Canada, France, the Netherlands, Singapore and the U.K. Over the long run, we expect our sovereign opportunity to grow at least in line with the AI infrastructure market as countries spend on AI proportional to their GDP.
While small amounts of H200 products for China-based customers were approved by the U.S. government, we have yet to generate any revenue. And we do not know whether any imports will be allowed into China.
Our competitors in China, bolstered by recent IPOs, are making progress and have the potential to disrupt the structure of the global AI industry over the long term. To sustain its leadership position in AI compute [indiscernible] must engage every developer and be the platform of choice for every commercial business, including those in China. We will continue to engage with the U.S. and China governments and advocate for America's ability to compete around the world.
We unveiled the Rubin platform last month at CES, comprised of 6 new chips: the Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU and Spectrum-6 Ethernet switch. The platform will train MoE models with 1/4 the number of GPUs and reduce inference token costs by up to 10x compared to Blackwell. We shipped our first Vera Rubin samples to customers earlier this week, and we remain on track to commence production shipments in the second half of the year. Based on its modular cable-free tray design, Rubin will deliver improved resiliency and serviceability relative to Blackwell. We expect every cloud model builder to deploy Vera Rubin.
Moving to gaming. Gaming revenue of $3.7 billion increased 47% year-on-year, driven by strong Blackwell demand and improved supply. GeForce RTX is the leading platform for PC gamers, creators and developers. In Q4, we added several new technologies and advancements, including DLSS 4.5, which uses AI to bring game visuals to a new level; G-SYNC Pulsar, bringing incredibly clear graphics even in motion; and 35% faster LLM inference across leading AI PC frameworks. Looking ahead, while end demand for our products remains strong and channel inventory levels are healthy, we expect supply constraints to be a headwind to Gaming in Q1 and beyond.
Professional Visualization crossed the $1 billion mark for the first time, with revenue of $1.3 billion, up 159% year-over-year and 74% sequentially. During the quarter, we launched the RTX Pro 5000 Blackwell Workstation with 72 gigabytes of fast memory for AI [indiscernible] AI developers running LLMs and agentic workflows.
Automotive revenue of $604 million was up year-over-year and was driven by robust demand for self-driving solutions. At CES, we introduced Alpamayo, the world's first open portfolio of reasoning vision-language-action models, simulation blueprints and data sets, enabling vehicles that can think. The first passenger car featuring Alpamayo, built on NVIDIA DRIVE, will be on the road soon in the new Mercedes-Benz CLA. Physical AI is here, having already contributed north of $6 billion in NVIDIA revenue in fiscal year 2026. Robotaxi rides are growing exponentially, and commercial fleets from Waymo, Tesla, Uber, WeRide, Zoox and many others are expected to scale from thousands of vehicles in 2025 to millions over the next decade, creating a market poised to generate hundreds of billions of dollars of revenue.
This expansion will demand orders of magnitude more compute, with every major OEM and service provider developing on NVIDIA's platform. We continue to advance robotics development with the new NVIDIA Cosmos and Isaac GR00T open models, frameworks and NVIDIA-powered robots and autonomous machines for leading [indiscernible] companies, including Boston Dynamics, Caterpillar, Franka Robotics, LG Electronics and NEURA Robotics. To accelerate industrial physical AI adoption, we also announced new and expanding partnerships with Dassault Systemes, Siemens and Synopsys to bring NVIDIA AI infrastructure, Omniverse digital twins, world models and CUDA-X libraries to millions of researchers, designers and engineers building the world's industries.
Let's move to the rest of the P&L. GAAP gross margin was 75% and non-GAAP gross margin was 75.2%, increasing sequentially as Blackwell continued to ramp. GAAP operating expenses were up 16% sequentially and up 21% on a non-GAAP basis, related to new product introductions and compute and infrastructure costs. Non-GAAP effective tax rate for the fourth quarter was 15.4%, below our outlook for the quarter, primarily due to the impact of a one-time tax benefit. Inventory grew 8% quarter-over-quarter, while purchase commitments also increased significantly, and we have strategically secured inventory and capacity to meet demand beyond the next several quarters. This is further out in time than usual and reflects the longer demand visibility we have.
While we expect tightness in the supply for our advanced architectures to persist, we remain confident in our ability to capitalize on the growth opportunity ahead with our scale, expansive supply chain and the long-standing partnerships continuing to serve us well.
We generated free cash flow of $35 billion in Q4 and $97 billion in fiscal year 2026. For the year, we returned $41 billion or 43% of free cash flow to our shareholders in the form of share repurchases and dividends. We continue to invest in technology and our ecosystem to cultivate market development, drive long-term growth and ultimately yield total shareholder returns superior to the market or our peer group.
Importantly, we will continue to run a strategic and disciplined process as it relates to our investments and we remain committed to returning capital to our shareholders.
Let me turn to the outlook for the first quarter. Starting this quarter, we will be including stock-based compensation expense in our non-GAAP results. Stock-based compensation is a foundational component of our compensation program to attract and retain world-class talent. Let me first start with revenue. So revenue is expected to be $78 billion, plus or minus 2%. We expect most of our growth to be driven by Data Center. Consistent with last quarter, we are not assuming any Data Center Compute revenue from China in our outlook. GAAP and non-GAAP gross margins are expected to be 74.9% and 75%, respectively, plus or minus 50 basis points. For the full year, we continue to see gross margins in the mid-70s. We will keep you updated on our progress as we prepare for the Vera Rubin transition.
GAAP and non-GAAP operating expenses are expected to be approximately $7.7 billion and $7.5 billion, respectively, including stock-based compensation expense of $1.9 billion. For the full year, we expect non-GAAP operating expenses to grow in the low 40s on a year-over-year basis as we continue to invest in our expanding opportunity set. For the full year fiscal year '27, we expect GAAP and non-GAAP tax rates to be in between 7% and 19%, excluding any discrete items and material changes to our tax environment.
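As a reading aid, the outlook ranges above can be worked out with a few lines of arithmetic. This is a minimal sketch and not part of the call: the function names are illustrative, and the only inputs are the midpoints and tolerances quoted in the outlook (revenue of $78 billion plus or minus 2%, gross margins of 74.9% and 75% plus or minus 50 basis points).

```python
def guidance_range(midpoint, tolerance_pct):
    """Return (low, high) for a midpoint with a symmetric percentage tolerance."""
    delta = midpoint * tolerance_pct / 100
    return midpoint - delta, midpoint + delta


def margin_range(midpoint_pct, tolerance_bps):
    """Return (low, high) in percent; 100 basis points equal 1 percentage point."""
    delta = tolerance_bps / 100
    return midpoint_pct - delta, midpoint_pct + delta


# Inputs quoted in the outlook: revenue in billions of dollars, margins in percent.
rev_low, rev_high = guidance_range(78.0, 2.0)
gaap_low, gaap_high = margin_range(74.9, 50)
non_gaap_low, non_gaap_high = margin_range(75.0, 50)

print(f"Revenue range: ${rev_low:.2f}B to ${rev_high:.2f}B")
print(f"GAAP gross margin range: {gaap_low:.1f}% to {gaap_high:.1f}%")
print(f"Non-GAAP gross margin range: {non_gaap_low:.1f}% to {non_gaap_high:.1f}%")
```

The percentage tolerance scales with the midpoint, while the basis-point tolerance is an absolute shift in percentage points, which is why the two helpers are kept separate.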
With that, let me turn the call over to Jensen. I think he has a few words for us.
This quarter, we significantly deepened and expanded our partnerships with leading frontier model makers. We recently celebrated OpenAI's launch of GPT-5.3-Codex, trained with and inferencing on Grace Blackwell NVLink 72 systems. GPT-5.3-Codex can take on long-running tasks that involve research, tool use and complex execution. 5.3-Codex is deployed broadly inside NVIDIA. Our engineers love it.
We continue to work with OpenAI toward a partnership agreement and believe we are close. We are thrilled with our ongoing partnership with OpenAI, a once-in-a-generation company we've had the pleasure of partnering with since their first days. Meta Superintelligence Labs is scaling up at lightning speed. Last week, we announced that Meta is deploying millions of Blackwell and Rubin GPUs, NVIDIA CPUs and Spectrum-X Ethernet for training and inference.
This quarter, we announced a partnership with Anthropic and a $10 billion investment in their company. Anthropic will train and inference on Grace Blackwell and Vera Rubin systems. Anthropic's Claude Cowork agent platform is revolutionary and has opened up the floodgates for enterprise AI adoption. Between Claude Cowork and OpenClaw, compute demand is skyrocketing and the ChatGPT moment of agentic AI has arrived. With partnerships spanning Anthropic, Meta, OpenAI and xAI, NVIDIA deployed across every cloud, and our ability to build full-stack AI infrastructure from the ground up or support them in the cloud, we're uniquely positioned to partner with frontier model builders at every stage: training, inference and AI factory scale-out.
Finally, we recently entered into a nonexclusive licensing agreement with Groq for its low-latency inference technology and welcomed the team of brilliant engineers to NVIDIA. As we did with [ Mellanox ], we will extend NVIDIA's architecture with Groq innovations to enable new levels of AI infrastructure performance and value. We look forward to sharing more at GTC next month. Okay, back to you.
We will now transition to Q&A. Operator, please poll for questions.
[Operator Instructions] Your first question comes from Vivek Arya with Bank of America Securities.
I think you mentioned that you now have growth visibility into calendar '27 also, and I think your purchase commitments kind of reflect that confidence.
But Jensen, I'm curious, when you look at your top cloud customers, with cloud CapEx close to $700 billion this year, many investors are concerned that it would be harder for this level to grow into next year. And for several of them, their cash flow generation capability is also getting compressed. So I know you're very confident about your road map, right, and your purchase commitments and whatnot, but how confident are you about your customers' ability to continue to grow their CapEx? And if their CapEx doesn't grow, can NVIDIA still find a way to grow in that envelope?
I am confident in their cash flow growing. And the reason for that is very simple. We have now seen the inflection of agentic AI and the usefulness of agents across the world and enterprises everywhere. You're seeing incredible compute demand because of it. In this new world of AI, compute is revenues. Without compute, there's no way to generate tokens. Without tokens, there's no way to grow revenues. So in this new world of AI, compute equals revenues. And I am certain that at this point with the productive use of Codex and Claude Code and the excitement around Claude Cowork and just the incredible enthusiasm about OpenClaw and the enterprise versions of them, and all of the enterprise ISVs who are now working on agentic systems on top of their tools platforms.
I am certain at this point that we are at the inflection point, we've reached the inflection point and we're generating profitable tokens that are productive for customers and profitable for the cloud service providers. And so the simple logic of it, the simple way to think about it, is computing has changed. What used to be software running on computers, a modest amount of computers, call it $300 billion or $400 billion worth of CapEx each year, has now gone into AI. And AI, in order to have -- in order to generate tokens, you need compute capacity. And that translates directly to growth and that translates directly to revenues.
Your next question comes from Joe Moore with Morgan Stanley.
Congratulations on the numbers. You talked about some of the strategic investments that you've made into Anthropic and potentially OpenAI [indiscernible] as well but also partners, Intel, Nokia, Synopsys. You're clearly at the center of everything. Can you talk about the role of those investments? And kind of how do you view the balance sheet as a tool to kind of grow NVIDIA's position in the ecosystem and participate in that growth?
As you know, fundamentally, at the core of everything NVIDIA is our ecosystem. That's what everybody loves about our business. The richness of our ecosystem, just about every start-up in the world is working on NVIDIA's ecosystem -- on NVIDIA's platform. We're in every cloud. We're in every on-prem data center. We're all over the world's edge and robotic systems. Thousands of AI natives are built on top of NVIDIA. We want to take the great opportunity that we have as we're in the beginning of this new computing era, this new computing platform shift to put everybody on NVIDIA. Everything is already built on CUDA, and so we're starting from a really terrific starting point.
But as we build out the entire AI ecosystem, whether it's an AI for language or physical AI or AI physics or biology or robotics for manufacturing, we want all of these ecosystems to be built on top of NVIDIA. And this is such a wonderful opportunity for us to invest into the ecosystem across the entire stack. Our ecosystem is also richer today than it used to be.
We used to be largely a computing platform on GPUs, but now we're a computing AI infrastructure company, and we have computing platforms on, well, every aspect of that. And everything from computing to AI models to networking to our DPU, all of that has computing stacks on top of it. And as I mentioned before, whether it's in enterprise or in manufacturing, industrial or science or robotics, each one of these ecosystems has different stacks. And we want to make sure that we continue to invest into our ecosystem. So our investments are focused very squarely, strategically on expanding and deepening our ecosystem reach.
Your next question comes from Harlan Sur with JPMorgan.
Networking continues to rise as a percentage of your overall Data Center profile, right? Through fiscal '26, your networking revenues accelerated on a year-over-year basis every single quarter, right, with 3.6x growth, as you guys mentioned, year-over-year growth in Q4, obviously on the strength of your scale-up and scale-out networking product portfolio. We seem to remember that in the first half of last year, your annualized run rate on your Spectrum-X Ethernet switching platform was around $10 billion. It looks like that may have stepped up to around $11 billion, $12 billion in the second half of last year. Jensen, looking at your order book, especially with Spectrum-XGS and the upcoming 102T Spectrum-6 switching platforms launching soon, what is the Spectrum run rate trending at now and as you foresee exiting sort of this calendar year?
Yes. As you know, we see ourselves as an AI infrastructure company. And the AI computing infrastructure includes CPUs, GPUs, and we invented NVLink to scale up one computing node into a giant computing rack. We invented the idea of a rack-scale computer. We don't ship nodes of computers. We ship racks of computers. And those -- the NVLink Switch scale-up system is then scaled out using Spectrum-X and [indiscernible]. We support both. And then further, we also scale across data centers using Spectrum-X scale-across. And so the way we think about networking is really an extension. We offer everything openly so that people could decide to mix and match at different scales and however they would like to integrate it into their bespoke data center. But in the final analysis, it's all one big part of our platform.
And the invention of NVLink really turbocharged our Networking business. Every rack comes with 9 nodes of Switches. And each of them has 2 chips in it. And in the future, they'll have more. And so the amount of switching that we do per rack is really quite incredible. And we're also now the largest networking company in the world and if you look at Ethernet, we came into the Ethernet market about a couple of years ago into Ethernet Switching. And I think that we're probably the largest Ethernet networking company in the world today and surely will be soon.
And so Spectrum-X Ethernet has been a home run for us. But we're open to however people want to do networking. Some people just really love the low latency and the scale-up capability of InfiniBand and we'll continue to support that, of course. And some people love to integrate their networking across their data center based on Ethernet, and we created an Ethernet capability that extends Ethernet with artificial intelligence, a way of processing in the data center, and we're incredibly good at that. And our Spectrum-X performance really shows it. When you build a $10 billion or $20 billion AI factory, the difference of 10%, and it could be easily 20%, on the effectiveness and the utilization of your network for your data center, that translates to real money. And so NVIDIA's Networking business is really growing fast. And I think it's just because we built the AI infrastructure so effectively and the AI infrastructure business is growing incredibly fast.
Your next question comes from CJ Muse with Cantor Fitzgerald.
I guess with CPX for large context windows and Groq likely adding a decode-specific solution, I'm curious how we should think about your future road map. Should we be thinking about customized silicon, either by workload or customer, as an increasing focus by NVIDIA, particularly helped by your move to dielet architecture?
We don't use -- we want to -- everybody should want to extend, push out [indiscernible] as long as they can. And the reason for that is because every time you cross a [indiscernible], you have [indiscernible], you have to cross an interface. Every time you cross an interface, you add latency, you add power unnecessarily. We're not allergic to [ dielets ]. We use dielets already, but we try to use dielets only when we absolutely have no choice but to do so. And so we -- if you look at the Grace Blackwell architecture and the Rubin architecture, we use 2 giant reticle-limited dies and we have [indiscernible] and that reduces the amount of architecture crossing. The [indiscernible] shows up in the architecture effectiveness of the competitors. If you look at NVIDIA, people call it our software advantage, but where software starts and architecture starts and ends, it's kind of hard to tell. It's -- our software is effective because our architecture is so good.
And so the CUDA architecture is unquestionably more effective, more efficient, delivers more performance per flop, per watt than any computing architecture out there, and it's because of the way we architect. With respect to how we think about Groq and the low-latency decoder, I've got some great ideas that I'd like to share with you at GTC. But the simple idea is that our infrastructure is incredibly versatile because of CUDA, and we're going to continue to do that. All of our GPUs are architecturally compatible, which means that when I'm working on optimizing models today for Blackwell, all of that work and all that dedication to optimizing software stacks and new models also benefit Hopper and also benefit Ampere. It's the reason why A100 continues to feel fresh and continues to stay performant years after we've deployed it into the world. Architecture compatibility allows us to do that. It allows us to invest enormously in software engineering and optimization, knowing that our entire installed base, in the cloud, on-prem, everywhere, across generations of GPU architectures, will all benefit.
And so we'll continue to do that, and it allows us to extend the useful life, allows us to have innovation, flexibility and velocity, which translates to performance and, very importantly, performance per dollar and performance per watt for our customers. And so what we'll do with Groq, you'll come to see at GTC, but what we'll do is we'll extend our architecture with Groq as an accelerator in very much the way that we extended NVIDIA's architecture with Mellanox.
The next question comes from Stacy Rasgon with Bernstein Research.
Colette, I wanted to dig a little bit into the call for sequential growth through the year. So I mean, you grew this quarter more than $10 billion sequentially in Data Center and the guidance seems to imply the bulk of the increased $10 billion sequential in Data Center and so on. How do you see that as we go through the year, especially as Rubin ramps into the back half? Blackwell has been a pretty massive acceleration for sequential growth. Should we expect something similar as we get to Rubin?
And then I was also just hoping you could comment on your expectations for Gaming. I understand the memory issues and everything else. Do you think gaming can still grow year-over-year in fiscal '27? Or will that be under more pressure given memory? So those two questions, please.
Thanks, Stacy. Let me start with the revenue going forward. Again, we're trying to look at revenue quarter-by-quarter. As you think about the full year, we are absolutely going to be still selling and providing Blackwell, probably at the same time that we're also seeing Vera Rubin come to market. This is a very great architecture that helps them just today quickly stand up, and we have already planned on many different orders across the different customers to provide that. It's too early yet to determine how much in terms of that Vera Rubin, that beginning ramp will start in the second half, and we'll get through it. But no confusion in terms of the strong demand and the interest. We do expect pretty much every single customer to be purchasing Vera Rubin.
The question is how soon are we in market and how soon are they able to stand that up in their data centers. That was your first part. The second part was focusing on our gaming. As much as we would love to have more supply, we do believe for a couple of quarters, it is going to be very tight. If things improve by the end of the year, there is an opportunity to think about what that is from a year-over-year growth. But it's still too early for us to know at this time, and we'll get back to you as soon as we can.
Your next question comes from Atif Malik with Citi.
Jensen, I'm curious if you can touch on the importance of CUDA as now more of the investment dollars in AI are coming from inference workloads.
Without CUDA, we wouldn't know what to do with inference. The entire stack, from TensorRT-LLM that we introduced a few years ago, which is still the most performant inference stack in the world -- optimizing it for NVLink requires us to discover and invent new parallelization algorithms that sit on top of CUDA to distribute the workload and the inferencing to take advantage of the aggregate bandwidth across NVLink 72.
NVLink 72 has enabled us to deliver generationally 50x more performance per watt. It's just an incredible leap. And it's sensible. NVLink 72 is a great invention. It was hard to do. The creation of the switching technology, disaggregating the switches, building the system racks, all of that, we did it all in plain sight and everybody knew how hard it was for us to do. And -- but the results are incredible. So performance per watt is 50x, performance per dollar 35x. And so the leap in inference is incredible. It's very important -- it's really important to realize that inference equals revenues now for our customers. Because agents are generating so many tokens, and the results are so effective. When the agents are coding, it's off generating thousands, tens of thousands, hundreds of thousands because they're running for minutes to hours. And so these systems, these agentic systems are spawning off different agents, working as a team. The number of tokens that are being generated has really, really gone exponential.
And so we need to inference at a much higher speed. And when you're inferencing at a much higher speed and each one of those tokens is dollarized, it directly translates into revenues. And so inference equals -- inference performance equals revenues for our customers. For the data centers, inference tokens per watt translates directly to the revenues of the CSPs. And the reason for that is because everybody is power limited. And so I mean, no matter how many data centers you have, each data center, 100 megawatts or 1 gigawatt, has power limits. So the architecture that has the best performance per watt translates -- because each token, each -- the performance, tokens per watt, each token is dollarized -- tokens per watt translates to dollars per watt, which translates in a gigawatt directly to revenues.
And so you could see that every CSP understands this now, every hyperscaler understands this, that CapEx translates to compute. Compute with the right architecture translates to maximizing revenues, and compute equals revenues. Without investing in capacity today, without investing in compute, there cannot be revenue growth. And that, I think everybody understands. Compute equals revenues, and choosing the right architecture is incredibly important. It's more than strategic now. It directly affects their earnings, and choosing the right architecture, the one with the best performance per watt, is literally everything.
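The chain described in this answer (tokens per watt, to dollars per watt, to revenue for a power-limited data center) can be sketched as simple arithmetic. This is an illustration under stated assumptions and not a model from the call: the function name, the efficiency of 1 token per joule, the price of $1 per million tokens and the full utilization are all hypothetical placeholders; only the 1 gigawatt facility size echoes the answer.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_watts, tokens_per_joule,
                         usd_per_million_tokens, utilization=1.0):
    """Yearly token revenue in dollars for a power-limited facility.

    tokens_per_joule is tokens per second per watt, so multiplying by
    the power budget in watts gives tokens per second.
    """
    tokens_per_second = power_watts * tokens_per_joule
    tokens_per_year = tokens_per_second * SECONDS_PER_YEAR * utilization
    return tokens_per_year / 1e6 * usd_per_million_tokens


# Hypothetical inputs: a 1 GW site, 1 token per joule, $1 per million tokens.
ONE_GIGAWATT = 1e9
baseline = annual_token_revenue(ONE_GIGAWATT, 1.0, 1.0)

# Same power budget, hypothetical architecture with twice the tokens per joule.
improved = annual_token_revenue(ONE_GIGAWATT, 2.0, 1.0)

print(f"Baseline: ${baseline / 1e9:.1f}B per year")
print(f"Improved: ${improved / 1e9:.1f}B per year")
print(f"Revenue ratio at fixed power: {improved / baseline:.1f}x")
```

Because power is the fixed constraint in this sketch, revenue scales linearly with tokens per joule, which is the point the answer makes about performance per watt.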
Your next question comes from Ben Reitzes with Melius Research.
First, let me say kudos on including the stock comp in non-GAAP. I think that's a great move. But that isn't my question. My question is around gross margins and the sustainability of the mid-70s long term. Should we read into the visibility on supply being available into calendar '27 that it's sustainable until then? And then, Jensen, what about after that? Are there innovations in memory consumption you can unveil that makes us feel better about the ability to keep margins at that level for a long time?
The single most important lever of our gross margins is actually delivering generational leaps to our customers. That is the single most important thing. If we could deliver generationally performance per watt that exceeds dramatically what Moore's Law can do -- if we can deliver performance per dollar dramatically more than the cost of our systems, than the price of our systems, then we can continue to sustain our gross margins. That's the single most important concept.
Every -- the reason why we're moving so fast is because, number one, the demand for tokens in the world as a result of the inflection points that we've gone through has now -- has gone completely exponential. I think we're all seeing that to the point where even our 6-year-old GPUs in the cloud are completely consumed and the pricing is going up. And so we know that the amount of computation necessary, the amount of compute necessary for the modern way of doing software is growing exponentially. And so our strategy is to deliver an entire AI infrastructure every single year. This year, we introduced 6 new chips. Rubin next generation will do many new chips as well. And every single generation, we are committed to deliver many x factors of performance per watt and performance per dollar. And that pace and our ability to do extreme codesign allows us to deliver that value and that benefit to the customers. And that is the single most vital thing as it relates to our value delivered.
Your next question comes from Antoine Chkaiban with New Street Research.
I'd like to ask about space data centers, which some of your customers are considering. How feasible do you think that is on what kind of horizon? And what do the economics look like today? And how do you think that could evolve over time?
Well, the economics are poor today, but it's going to improve over time. As you know, the way that space works is radically different than how it works down here. There's an abundance of energy, but solar panels are large, but there's plenty of space in space. The heat dissipation, it's cold in space. However, there's no airflow. And so the only way to dissipate heat is through conduction, and the radiators that you need to create are fairly large. Liquid cooling is obviously out of the question because it's kind of -- it's heavy and freezes. And so the methods that we use here on Earth are a little different than the way we would do it in space. But there are many different computing problems that really want to be done in space. And so NVIDIA is already the world's first GPU in space, Hoppers in space. And one of the best use cases of GPUs in space is imaging, to be able to image at extremely high resolutions using, of course, optics and artificial intelligence.
And to be able to do that computation of reprojection of different angles and be able to upres and do noise reduction and just be able to see -- be able to image at very large -- very high resolutions, extremely large scales and very, very fast. It's hard to do that by sending petabytes and petabytes of imaging data back here on Earth and doing that work. It's easier just to do it out in space. And then ignore all of the data collected and processed until you see something interesting. And so artificial intelligence in space will have very good, very interesting applications.
Your next question comes from Mark Lipacis with Evercore ISI.
I want to pick up on the comment you made on the script about revenue diversification. I believe, Colette, you said that hyperscalers were over 50% of revenues, but growth was led by the rest of your data center customers. And as a clarification, I just want to make sure I understood that. Does that imply your non-hyperscale customers grew faster? And if so, what are the -- can you help us understand what are the non-hyperscalers doing different? Are they doing different things than the hyperscalers? Or the same things on a different scale? And does this -- do you expect this trend to continue? Would you expect your customer base to evolve to a point where non-hyperscalers become a bigger part of your -- the larger part of your business?
Yes. Let's see if we can help on this question. So when you think about our top 5, as we articulated as being our CSPs, our hyperscalers, they are right now at about 50% of our total revenue. There's a big organization, therefore, of diversity of all different other types of companies that we are working with, that goes through our AI model makers, that goes through our enterprises, that goes to supercomputing. It goes to our sovereigns. There's a lot of other different facets on there. But you are correct. It's a very fast-growing area as well. We have a strong position in terms of all of our different cloud providers on our platform. And now we also have an extreme diversity of different customers that we are seeing all the way across the world. And this will really benefit, seeing that diversity and being able to serve all of those parts. I'm going to see if Jensen wants to add to that?
Yes, this is one of the advantages that we have with our ecosystem, all built on top of CUDA. We have -- we're the only accelerated computing platform that is in every cloud, that's available through every single computer maker, available at the edge, and we're now cultivating telecommunications. Obviously, the future radios will all be AI-driven radios and the future wireless network will also be a computing platform. That is a foregone conclusion, but somebody has to go and invent the technologies to make that possible. And we created a platform, Aerial, to go do that. We're out in just about every single robot, every single self-driving car. Our ability, CUDA's ability to have the benefit of the performance of specialized processors on the one hand, with the tensor cores inside our GPUs.
On the other hand, the flexibility of CUDA allows us to solve language problems, computer vision problems, robotics problems to biology problems, physics problems and just about all kinds of AI and all kinds of computation algorithms. And so -- the diversity of our customer base is one of the greatest strengths that we have. The second thing, of course, is without our own ecosystem, even if our process was programmable, if we didn't cultivate our ecosystem and talking about some of the things that we're doing today, investing in our future ecosystem and continue to enhance our ecosystem without our ecosystem, it's hard for us to grow beyond what design wins we capture for somebody else's ecosystem. And so we could grow and expand our ecosystem very naturally because of our the platform that we created.
And then lastly, one of the things that's really important is the partnerships that we have with OpenAI and Anthropic with xAI with Meta now makes, and of course, just about every single open source in the world. There's 1.5 million AI models on Hugging Face. Not all of it runs on NVIDIA CUDA. And so -- and open source in totality probably represents the largest -- the second largest model in the world, OpenAI is the largest, second largest probably all the collection of all the open sources. And so NVIDIA's ability to run all of that makes our platform super fungible, super easy to use and really safe to invest into. And so that creates the diversity of customers and diversity of the platforms available in every single country because we support the whole world's ecosystem.
Your next question comes from Aaron Rakers with Wells Fargo.
Yes. I guess sticking with the idea of the platform and extreme codesign. Some of the news over this last quarter has obviously been NVIDIA's ability or push to bring Vera CPUs to market on a stand-alone solution basis. So I guess, Jensen, I'm curious what importance Vera plays in this architecture evolution as we move forward? Is this being driven more by the proliferation or the heterogeneity of inference workloads? I'm just curious how you see that evolving for NVIDIA, particularly on a stand-alone CPU basis?
Yes. Thanks. And I'll tell you some more about it at GTC. But at the highest level, we made fundamentally different architecture decisions about our CPUs compared to the rest of the world's CPUs. It's the only data center CPU that supports LPDDR5. It is designed to be focused on very high data processing capabilities. And the reason for that is because most of the computing problems that we're interested in are data-driven, artificial intelligence being one. And the single-threaded performance in this ratio with bandwidth is just off the charts.
And we made those architectural decisions because in the entire phase, the different phases of AI from data processing, before you even do training, you have to do data processing. So you have data processing, pre-training and in post-training now, the AIs are learning how to use tools. And the usage of tools, many of those tools run in CPU-only environments or they run in CPU with GPU-accelerated environment. And Vera was designed to be an excellent CPU for post-training. And so some of the use cases in the entire pipeline of artificial intelligence includes using a lot of CPUs. We love CPUs as well as GPUs. And when you accelerate the algorithms to the limit as we have, Amdahl's Law would suggest that you need really, really fast single-threaded CPUs, and that's the reason why we built Grace to be extraordinary to be great at single-threaded performance, and Vera is off the charts better than that.
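As an illustration of the Amdahl's Law argument above, here is a minimal Python sketch. It is not from the call: the function names, the 95% accelerated fraction, and the 1,000x and 2x factors are assumptions chosen only to show the shape of the effect. Once the accelerated portion of a workload is sped up enormously, the remaining serial (CPU-bound) portion dominates total time, so a faster single-threaded CPU raises the overall speedup far more than further acceleration would.

```python
def amdahl_speedup(accelerated_fraction: float, accel_factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work runs
    `accel_factor` times faster and the rest is unchanged (Amdahl's Law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if accel_factor <= 0:
        raise ValueError("accel_factor must be positive")
    serial_fraction = 1.0 - accelerated_fraction
    return 1.0 / (serial_fraction + accelerated_fraction / accel_factor)


def speedup_with_faster_cpu(accelerated_fraction: float,
                            accel_factor: float,
                            cpu_factor: float) -> float:
    """Same model, but the serial part also runs `cpu_factor` times faster
    (a hypothetical stand-in for better single-threaded CPU performance)."""
    if cpu_factor <= 0:
        raise ValueError("cpu_factor must be positive")
    serial_time = (1.0 - accelerated_fraction) / cpu_factor
    accelerated_time = accelerated_fraction / accel_factor
    return 1.0 / (serial_time + accelerated_time)


if __name__ == "__main__":
    # Illustrative numbers only: 95% of the work is accelerated 1,000x.
    base = amdahl_speedup(0.95, 1_000)
    # Doubling single-threaded speed on the remaining 5%.
    faster_cpu = speedup_with_faster_cpu(0.95, 1_000, 2.0)
    print(f"accelerator only: {base:.1f}x")
    print(f"plus 2x faster serial part: {faster_cpu:.1f}x")
```

Under these assumed numbers the accelerator alone is capped near 20x by the serial 5%, while doubling the speed of that serial part roughly doubles the end-to-end speedup.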
Your next question comes from Tim Arcuri with UBS.
Colette, I was wondering if you can talk about the deployment of capital. I know that you really jacked up the purchase commits, but it sounds like maybe you're over the hump on this, and you're going to probably generate about $100 billion in cash this year. And pretty much no matter how good the results have been, the stock hasn't really gone up much. So I think that you probably feel like this is a pretty good price to be buying back a bunch of it here. So I was wondering if you can talk about that, the question being, why not put a big stake in the ground and just have a huge share repo here?
So thanks for the question. We look at our capital return very, very carefully, and we do believe that one of the most important things that we can do is really supporting the extreme ecosystem that's in front of us, that stems from everywhere from our suppliers and the work that we need to do to assure that we can have the supply that's needed and help them from a capacity all the way that we are in terms of the early developers of the AI solutions that will be on our platform. So we will continue to make this a very important part of our process and strategic investments.
But of course, we are still repurchasing our stock. We are still with our dividend as well, and we will continue to find the right unique opportunities within the year for doing those different purchases.
Your final question comes from Jim Schneider with Goldman Sachs.
Jensen, you've previously outlined the potential to get to $3 trillion to $4 trillion of Data Center CapEx by 2030, which implies a potential acceleration in growth rates, which you've sort of guided to at least this next quarter. The question is what are some of the key application areas that you believe are most likely to drive that inflection? Is that physical AI, agentic or something else? And do you still feel good about that $3 trillion to $4 trillion envelope?
Yes. Let's back that up and just reason through it from a few different ways. So the first way is our first principles, the way that software is done in the future, using AI is token driven. And I think everybody talks about tokenomics and talks about data centers generating tokens and inference is about generating tokens and we generate tokens. We're just talking about tokens, how NVIDIA NVLink 72 enabled us to generate tokens at 50x better performance per unit energy than the previous generation. And so token generation is at the center of almost everything that relates to software in the future and it relates to computing.
If you look at the way we used computing in the past, however, the amount of computation demand for software in the past is a tiny fraction of what is necessary in the future. And AI is here, AI is not going to go back. AI is only going to get better from here. And so if you think about it and you said, okay, well, the world was investing about $300 billion or $400 billion a year in classical computing, and now AI is here and the amount of computation necessary is 1,000 times higher than the way we used to do computing. The computing demand is just a lot higher, and so if we continue to believe there's value in it, and we'll talk about that in a second, then the world will invest to produce that token.
And so the amount of token generation capability that the world needs is a lot more than $700 billion, and I'm fairly confident that we're going to continue to generate tokens. We're going to continue to invest in compute capacity from this point out. And fundamentally, because -- every single company depends on software. Every software will depend on AI. And so every company will produce tokens, and that's the reason why I call them AI factories. And whether you're a company in the cloud data centers, you have AI factories to generate tokens for your revenues. If you're an enterprise software company, you're going to generate tokens for the agentic systems that are on top of your tools. If you are a robotics factory and self-driving cars, first indication of that, you have huge supercomputers, which are basically AI factories to generate tokens that goes into your cars that becomes its AI. And then you also have to put computers inside the cars to continuously generate tokens.
And so we're fairly sure now that this is the future of computing. Now why is it so certain that this is the future of computing?
And the reason for that is because the way we used to do software was prerecorded, everything was captured [ in prior ]. We pre-compiled the software. We pre-wrote the content. We prerecorded the videos. But now everything is generative in real time. And when it's generated in real time, the context of the person, the situation, the query and the intentions could all be taken into consideration to generate the outcome of this new software that we call AI, agentic AI. And so the amount of computation necessary is far, far greater than recorded. Just as a computer has a lot more computation capability than a DVD recorder, a DVD player that was prerecorded, artificial intelligence needs a lot more computing capability than the way we used to do software in the past.
Now the question about computation about sustainability. At the first level is just at the computer science level, this is the way computing is going to be done. Now from an industrial level, because -- all of our companies in the final analysis are powered by software and the cloud companies are powered by software and if the new software requires tokens to be generated, and the tokens are monetized, then it stands to reason that their data center build-out directly drives their revenues. And so compute drives revenues. And I think they all understand that. I think people are increasingly starting to understand that as well.
And then lastly, the benefits that AI produces for the world ultimately have to generate revenues. And we're seeing it right in front -- being developed as we stand here. Agentic AI has reached an inflection point, and it literally happened in the last 2, 3 months. Of course, inside the industry, we've been seeing it for a while, probably 6 months or so. But the world has now awakened to the agentic AI inflection. The agents are super smart. They're solving real problems. Coding is obviously supported by agentic systems now and all of our coders here at NVIDIA are using agentic systems enormously, either Claude Code or OpenAI Codex, oftentimes both, and Cursor, oftentimes all 3, depending on the use case. But they have agents as codesign partners, engineering partners to help them solve problems. And you could see the revenue skyrocketing. These companies, in the case of Anthropic, I think their revenue grew 10x in a year, and they are severely capacity constrained because the demand is just incredible, and the token demand is incredible. The token generation rate is growing exponentially. And the same thing with, of course, OpenAI, their demand is incredible. And so the more compute that they can stand online, bring online, the faster their revenues will grow.
And that goes back to the comment that I was making, that inference is revenues, that compute equals revenues now in this new world. And in a lot of ways, that's the reason why we say it's a new industrial revolution. There are new factories, new infrastructure being built and this new way of doing computing is not going to go back. And so, to the extent that we believe that producing tokens is going to be the future of computing, which I believe and I think largely the industry believes, then we're going to be building out this capacity from this point forward and continue to expand from here.
Now the thing that is -- the wave that we're seeing now is the agentic AI inflection, and the next inflection beyond that is physical AI, where we take AI and these agentic systems into the physical applications, such as manufacturing, such as robotics. And so that's a giant opportunity ahead.
This concludes the question-and-answer session. I'll turn the call to Toshiya Hari.
In closing, please note Jensen will be participating in a fireside chat at the Morgan Stanley TMT Conference in San Francisco on March 4. He'll also be giving a keynote at GTC in San Jose on March 16. Our earnings call to discuss the results of our first quarter of fiscal 2027 is scheduled for May 20. Thank you for joining us today. Operator, please go ahead and close the call.
NVIDIA — Q4 2026 Earnings Call
📊 Quarter at a glance
- Revenue: $68 billion (+73% YoY).
- Data Center: $62 billion in Q4 (+75% YoY, +22% sequentially); fiscal-year Data Center revenue $194 billion (+68% YoY).
- Gross margin: GAAP 75%, non-GAAP 75.2% (similar on a tax-adjusted basis).
- Free cash flow: $35 billion in Q4; $97 billion in fiscal 2026.
- Networking: $11 billion, more than 3.5x YoY, on strong NVLink/Spectrum-X demand.
🎯 What management says
- Inference leadership: according to management, Blackwell/NVLink 72 delivers massive performance-per-watt gains (up to ~50x) and drives inference adoption.
- Rubin platform: Rubin (Vera CPU + Rubin GPU, among others) announced; samples shipped, volume production planned for H2 2026; target: up to 10x lower inference token cost vs. Blackwell.
- Partnerships & capacity: strategic investments (e.g. Anthropic, $10 billion), broad cloud partnerships, purchase commitments and inventory to secure supply into calendar 2027.
🔭 Outlook & guidance
- Q1 revenue: $78 billion ±2%, predominantly Data Center; the outlook assumes no Data Center compute revenue from China.
- Margins & opex: GAAP gross margin 74.9% / non-GAAP 75%, ±50 bp; GAAP opex ~$7.7 billion, non-GAAP ~$7.5 billion including $1.9 billion of stock-based compensation (included in non-GAAP from Q1).
- Taxes: expected FY27 non-GAAP tax rate 7–19% (excluding discrete items).
❓ Analyst questions
- CapEx sustainability: analysts questioned whether cloud CapEx (top 5 ~50% of Data Center revenue) will stay this high in 2027; management argues "compute = revenues" and sees demand as stable.
- Networking run rate: demand for Spectrum-X/NVLink is driving networking revenue; management cites a rapid build-out, but run-rate commentary remained qualitative.
- Rubin ramp & gaming: questions on the timing and volume impact of Vera Rubin on sequential growth; management confirmed the start of volume production in H2 2026, while exact ramp volumes and the extent of gaming supply constraints remained unspecified.
⚡ Bottom line
- Assessment: the call confirms massive, data-driven Data Center growth momentum, robust margins and strong cash generation; Rubin and the NVLink ecosystem are expected to further cement NVIDIA's position. Risks: exclusion of China compute from guidance, increasing Chinese competition and temporary gaming supply constraints. For investors: strong growth and capital returns, but political and supply risks bear watching.
NVIDIA — Second Annual AI Summit
1. Management Discussion
So first of all, thanks, everybody, for being here for an incredibly long day. We started this thing early this morning, and we had speaker after speaker, after speaker after speaker. And then we had about a 2.5-hour break, and they came back to see you. So...
I've been up since 1:00.
So this guy -- this guy is on the tail end of a 2-week trip in 4 or 5 different cities in Asia.
One day ago, I was in Taiwan. Last night, I was in Houston. Here I am.
But he's been going 2 weeks, and we're standing between him and his personal bed versus a hotel. So we're going to have fun, and then we're going to get him out of here. So -- but you don't need much of an introduction, but thank you for being here, man.
Yes. Thank you.
We really appreciate it. And...
Thanks for the partnership, and really proud of you guys.
So let's start with that. We have had a partnership and you introduced this whole concept of AI factories, and we're working on this together. It's probably not going as fast as either one of us would like in the enterprise space. But can we start by talking about what do you -- what is an AI factory to you?
So first of all, remember, we're reinventing computing for the first time in 60 years. It used to be explicit programming, right? We wrote the programs, and the variables that are passed through APIs are very explicit. Now it is implicit programming: you tell the computer what your intent is and it goes off and it figures out how to solve your problem. So from explicit to implicit, from general-purpose computing, basically calculation, to artificial intelligence. The entire computing stack has been reinvented.
Now people talk about computing where the processing layer is, which is where we are. But remember what computing is. There's computing, there's the processing, but there's storage, networking and security. All that is being reinvented as we speak. And so the first part is we need to develop AI to a level, and we'll talk about that. We need to develop AI to a level that is useful to people. And until now, chatbots, where you give it a prompt and it figures out what to tell you, is interesting and curious but not useful.
Helps me finish crossword puzzles sometimes.
Yes. And -- but only on things that it had memorized and generalized. So if you look -- go back in the beginning of -- I mean, it's literally only 3 years ago, when ChatGPT emerged that we thought, "Oh my gosh, it's able to generate all these words. It's able to create Shakespeare." But it's all based on things that it memorized and generalized. And -- but we know that intelligence is about solving problems. And solving problems is partly about knowing what you don't know. Partly about reasoning how to solve a problem you've never seen before, breaking it down into elements that you know how to solve very easily so that in its composition you're able to solve problems that you've never seen before.
And to come up with a strategy, what we call a plan, to perform a task, ask for help, use tools, do research, so on and so forth. These are all fundamental things that now, in the phraseology of agentic AI, you've heard. Isn't that right? Tool use, research, retrieval-augmented generation, which is grounded on facts, memory. These are all things that all of you, in the context of talking about agentic AI, are starting to hear. But the important thing is in order to evolve from general purpose computing, which is explicit programming, we wrote it in Fortran, we wrote in C, we wrote in C++ to...
COBOL.
That's right. That's good stuff. That's good stuff. Chuck? That's good stuff.
It's my fallback job.
That's good stuff. That's good stuff. Yes. That's one of those skills that remains valuable.
I know.
It remains valuable.
I've got a lot of offers.
Dinosaurs are valuable forever.
We just established that you're older than me.
I know. And I'm the prehistoric. It doesn't appear so, but it's true.
All right. Pretty good.
I'm probably the oldest person in this room.
So how do you -- so Jensen, let's talk a little bit about like as you think about the AI...
So here we are. I went to Chuck and I said, "Hey, listen, we need to reinvent computing and Cisco has got to be a big part of it." And so we've got -- we have a whole new computing stack coming out, Vera Rubin, and Cisco is going to be time to market with us on that. And so that's the computing layer, but there's also the networking layer. And Cisco is going to integrate AI networking technology from us, but put it into the Cisco Nexus plane, control plane so that from your perspective, you're going to get all the performance of AI, but with the controllability and security and the manageability of Cisco.
We're going to do the same thing with security. And so each one of these pillars has to be reinvented so that enterprise computing could take advantage of it. But ultimately, and we'll come back to this, hopefully: why is it that enterprise AI wasn't ready 3 years ago, and why is it that you have no choice but to get engaged as quickly as you can, okay? Don't fall behind. I think there's -- you don't have to be the first company to take advantage of AI, but don't be the last. Yes.
So if you're an enterprise today, what's your recommendation on the first, second, third step they should take to begin to get ready?
Well, I get questions about things like ROI. And I wouldn't go there. And the reason for that is because with all technology deployments in the beginning, it's hard to put into a spreadsheet the ROI of a new tool, a new technology. But what I would do is I would go find out what is the single most -- what is the essence of my company? What's the most impactful work that we do in our company? Don't mess around. Don't mess around with peripheral stuff. I mean, in our company, we have -- we just let 1,000 flowers bloom. The number of different AI projects in our company, it's out of control, and it's great. Notice, I just said something. It's out of control and it's great.
Innovation is not always in control. If you want to be in control, first of all, you ought to seek therapy. But second, it's an illusion. You're not in control. If you want your company to succeed, you can't control it. You want to influence it, you can't control it. And so I think, number one, too many people want it -- too many companies, I hear, they want it explicit, they want it specific. They want demonstrable ROI. And showing the value of something worth doing in the beginning is hard. But what I would do -- what I would say is that let 1,000 flowers bloom, let people experiment, let the people experiment safely. And we're experimenting with all kinds of stuff in the company. We use Anthropic. We use Codex. We use Gemini. We use everything. And when one of our groups says, "I'm interested in using this AI," my first answer is yes, and then I'll ask why. Instead of why then yes, I say yes, then why.
And the reason for that is because I want the same thing for my company that I want for my kids, go explore life. They say they want to try something. The answer is yes. And then I say, how come? You don't go, prove it to me. Prove to me that doing this very thing is going to lead to financial success or some happiness someday. Prove to me. And until you prove it to me, I'm not going to let you do it. We never do that at home, but we do it at work. Do you know what I'm saying?
Yes.
It makes no sense to me. And so the way that we treat AI and whether it's AI or the Internet before or cloud before, just let 1,000 flowers bloom. And then at some point, you have to use your own judgment to figure out when to start curating the garden because 1,000 flowers blooming makes for a messy garden. But at some point, you have to start curating to find what's the best approach or what's the best platform so that you could put all your wood behind one arrow. But you don't want to put all your wood behind one arrow too soon. You might pick the wrong arrow. So let 1,000 flowers bloom. At some point, you curate.
And so I haven't started curating yet, just to put it in perspective. I've got 1,000 flowers blooming everywhere, but I encourage everybody to try. However, I know exactly what is most important to our company. Of course, I do. What is the essence of our company, what is the most important work of our company? And I make sure that I've got a lot of expertise and a lot of capability focused on using AI to revolutionize that work.
In our case, chip design, software engineering, system engineering, notice -- you might have noticed that we partnered with Synopsys and Cadence and Siemens and today Dassault so that we could insert our technology and infuse as much technology as they want, whatever they want, whatever they need, I will provide so that I could revolutionize the tools by which we use to design what we do. We use Synopsys everywhere. We use Cadence everywhere. We use Siemens everywhere. We use Dassault everywhere. I will make sure that they have 1,000% of whatever they want so that I have the tools necessary so I could create the next generation. And so that tells you something about how I -- my attitude about what's most important to me and what I will do to revolutionize my own work.
Think about what AI does. AI reduces the cost of intelligence or creates an abundance of intelligence by orders of magnitude. That's another way of saying what we used to do that takes one unit of time, what used to take a year, could take a day now. What used to take a year could take an hour. It could be done in real time. And the reason for that is because we are in the world of abundance. Moore's Law, goodness gracious, that was slow. That's like snails. Remember, Moore's Law was 2 times every 18 months, 10 times every 5 years, 100 times every 10, okay? But where are we now? 1 million times every 10 years.
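The compounding arithmetic behind those figures can be checked with a short Python sketch. It is illustrative only: the helper names are assumptions, and the only inputs are the numbers quoted in the remarks (doubling every 18 months, and 1 million times in 10 years).

```python
import math


def growth_factor(doubling_months: float, years: float) -> float:
    """Total growth after `years` if capability doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


def implied_doubling_months(total_factor: float, years: float) -> float:
    """Doubling period (in months) that would produce `total_factor` over `years`."""
    return years * 12.0 / math.log2(total_factor)


if __name__ == "__main__":
    # Moore's Law as quoted: 2x every 18 months.
    print(f"5 years:  {growth_factor(18, 5):.1f}x")    # roughly 10x
    print(f"10 years: {growth_factor(18, 10):.1f}x")   # roughly 100x
    # The quoted AI pace: 1,000,000x in 10 years.
    print(f"implied doubling period: {implied_doubling_months(1e6, 10):.1f} months")
```

Doubling every 18 months does compound to about 10x in 5 years and about 100x in 10, while 1 million times in 10 years corresponds to doubling roughly every 6 months.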
In the last 10 years, we advanced AI so far that engineers said, "Hey, guess what, why don't we just train an AI model on all of the world's data." They didn't mean let's just collect all the data from my disk drive. Let's just -- let's pull down all of the world's data, and let's train an AI model. That's the definition of abundance. The definition of abundance is you look at a problem so big, and you say, you know what, I'll do it all. I'm going to cure every field of disease. I'm not going to just do cancer. Are you kidding me? That's insane. We'll just do all of human suffering. That's abundance.
When I think about engineering, when I think about the problem these days, I just assume my technology, my tool, my instrument, my spaceship is infinitely fast. How long is it going to take for me to go to New York? I'll be there in a second. So what would I do different if I can get to New York in a second? What would I do different if something used to take a year and then now takes real time. What would I do different if something used to weigh a lot. And now it's just antigravity. And so you approach everything with that attitude. When you approach everything with that attitude, you are applying AI sensibility.
Does that make sense? For example, there are many companies that we're working with where the graph analytics, the relationships and dependencies in these graphs, they have so many nodes and edges, trillions of them. Back in the old days, you would process a graph in small pieces. These days, just give me the whole graph. How big is it? I don't care. That sensibility is being applied everywhere. If you're not applying that sensibility, you're doing it wrong. Does speed matter? Not at all. You're at the speed of light. If mass matters, you're at 0 weight, 0 gravity. If something was -- it was insanely hard for you in the past, and you go, yes, it doesn't matter. If you're not applying that logic, you're not doing it right.
Now imagine you apply that logic, that sensibility to the hardest problems in your company. That's how you're going to move the needle. And that's how they all think now. If you're not thinking that way, all you have to do is just imagine your competitor is thinking that way. If you're not thinking that way, just imagine a company that is about to get founded is thinking that way. It changes everything. And so I would go find where the most impactful work in your company is, apply infinity to it, apply 0 to it, apply the speed of light to it. And then ask Chuck how to make that happen.
Now let's talk about how to make that happen. So you have this analogy of...
Just call me.
I'll call you.
We'll do it together.
You have this analogy, this 5-layer cake because everybody is talking about like infrastructure models, apps...
Yes. What is AI? Yes.
I mean how do I go about it? Talk about that a little bit.
Well, the first -- one of the things that successful people do is they reason about what something is. What's happening here? So almost 15 years ago, an algorithm was able to -- with 2 engineers, solve a computer vision problem. Computer vision is basically the first part of intelligence, perception. Intelligence is perception, reasoning, planning. Perception: What am I -- what's going on? What's my context? Reasoning: How do I reason about it? How do I compare this to my goals?
And then three, come up with a plan to solve that, to achieve that, okay? And so that's -- so for example, the jet fighter problem, perception, localization and then action. And so intelligence is about those 3 things. You can't have the second and third part without perception. You can't understand -- you can't figure out what to do without understanding context. And context is highly multimodal. Sometimes it's a PDF, sometimes it's a spreadsheet, sometimes it's information, sometimes it's just senses and smells. Where are we? What are we doing here? Who's the audience? So on and so -- reading the room, so on and so forth, right? And so that's about perception.
And so about 13, 14 years ago, we made a huge, gigantic leap in computer vision, which is the first layer of the perception problem. And it was super hard. How do you solve computer vision? And AlexNet, the first breakthrough that we saw, it was kind of like the First Contact. I love that movie, the First Contact. It was like our first contact with AI. And the thing that we did was we said, okay, what does that mean? How is it possible that 2 engineers were able to overcome the algorithms that were -- that we worked -- all of us worked on for some 30 years. And Ilya Sutskever, I talked to him yesterday, and Alex Krizhevsky -- and how is it possible? Two kids with a couple of GPUs solved this problem. What does it mean? And so we broke it all down, and I reasoned about it a decade ago.
And I came to the conclusion that, in fact, most of the hard problems in the world that can be solved can be solved this way. And the reason for that is most of the hard problems in the world, most of the valuable problems have no principled algorithms. There's no F equals MA. There's no Maxwell's equation. There's no Schrödinger's equation. There's no Ohm's law. There's no -- it just doesn't exist. There's no law of thermodynamics. It's not that specific. Most of the valuable things are what we call intuition and wisdom, and it's all the problems that -- Chuck, the type of problems that you and I get, the answer is it depends. Do you know what I'm talking about? If it was 3, it would be great, if it was 3.14, it would be fantastic, okay? Those are the great ones. But most of the hard problems in life, most of the valuable problems in life are -- it depends because it depends on the context. It depends on the circumstance, context.
And so 12 years ago, 13 years ago, something like that, yes, computer vision was solved. And so we reasoned that, in fact, this could be scalable because of deep learning, and you can make the models larger and larger, and there was only one problem we had to solve, which is how do we train that model. And the big breakthrough was self-supervised learning or unsupervised learning. So AIs that go and learn by themselves. And notice, today, we're not limited by labeling anymore. We're not even close. And so that breakthrough opened up the floodgates for us to scale these models from a few hundred parameters -- a few hundred million parameters to billions to trillions. And the amount of knowledge we can codify, the number of skills we can learn algorithmically, really largely exploded. But the basic approach was the same.
And we reasoned that, in fact, we're going to reinvent -- and this is the beginning of our conversation -- we're going to reinvent computing altogether, from explicit programming to a new way of doing computing where the models, the software will be learned. Now what happens -- what does that mean? If you take another step back and you go, okay, what does that mean to the computing stack? What does it mean to -- what does it mean to how you develop software?
What has happened to the engineering organization in your company? What happens to the product marketing team that specifies the product? What happens to the engineering team that codifies the product? What happens to the QA team that evaluates the product? What do these products even become someday? How do we deploy the product? How do we keep it up to date? If it's learned, if it's based on machine learning, how do you keep it refreshed forever? How do you patch software? And so how do you -- so on and so forth.
The number of hows I asked about the future of computing, it must have been 1,000 questions. And I came to the conclusion, our company came to the conclusion that this is going to change everything. And so we pivoted the whole company based on that core belief. Simplistically, what Chuck is saying is that we came from a world where everything was prerecorded. The software that Chuck worked on is...
Really good stuff. It ran a very long time just for the record.
It was -- indeed, it was described in the Hebrew.
That is true. That was another skill. I mean COBOL and Hebrew, I mean...
Chuck is the only person in the room that knows Hebrew and COBOL. And so anyways, that was prerecorded. We engineer -- we describe our, we describe our thoughts. And then we put data that goes along with it. It's -- everything is prerecorded. The reason why it's prerecorded. The reason why you know software in the past was prerecorded is because it came in a CD ROM. Isn't that right?
It was prerecorded. Okay. What is software now? Because it's contextual, and every context is different. And every time everybody who uses the software is different and every prompt is different and all the -- and the precursor you give it, the priors you give it, the context is different. Every single instance of the software is different, which is the reason why the amount of computation necessary in the past, which is prerecorded, it's called retrieval based. All you have to do is check yourself.
When you use your phone, you touch something, it went and retrieved some software, read some file, some images and brought it to you.
In the future, everything is going to be generative, just like it's happening right now. This conversation has never happened before. The concepts existed before. The priors existed before. But every single word in the sequence has never happened before. And the reason for that is, obviously, we're 4 wines in.
COBOL and Hebrew have never come out of the -- cold brew yes, COBOL, Hebrew, no.
Thank goodness. This is not on campus.
Or being streamed. All right. Let's...
Do you understand what I'm saying? And so as a result, as a result...
Do you understand what you're saying?
The only thing that Chuck has fed me today so far is 4 glasses of wine.
And to be fair, I only fed you -- I fed you 1 of them. You took the other 3 off the buffet.
I was eyeing the food. I was like, I'm so hungry, I'm eyeing the food. It was forever about 40 feet away from me.
Because you were taking photos.
But it was -- I was like it was so close. It was so close. And I actually leaned towards the food 1 time, but I was pushed back again.
You know what, you know what happened, your team -- your team actually told us ahead of time. If you get 3 glasses of wine in, he is optimal. If you get fourth wine in, it's going to be incredible.
This is suboptimal. So anyways, listen, listen, listen. So what is AI? We had to leave some wisdom behind.
Can we get another glass of wine, please?
This is not just Dave Chappelle stuff.
Let's talk about some -- let's talk about 1 other thing.
Energy, chips.
Energy sounds good.
Energy, chips, infrastructure, both hardware and software, then the AI model, but the most important part of AI is applications. Every single country, every single company, all that layer underneath is just infrastructure stuff. What you need to do is apply the technology. For God's sakes, apply the technology. A company that uses AI will not be in peril. It's the company who -- you're not going to lose your job to AI, you're going to lose your job to someone who uses AI. So get to it. That's the most important thing. And call Chuck as soon as possible.
You call me, I'll call him. You got it. So we don't have a lot of time. So I'm not sure...
We have all the time in the world.
Do we?
Chuck, he runs -- he bills on the clock. I don't even wear a watch. Look at that, Chuck.
I got you right here. We're doing great.
You bill people on the clock. Not me. I'm not leaving until value is delivered. If it takes all night, I'm not -- look, I won't torture all of you until...
But Jensen, that's why my guys like me need a watch. All right. Can you...
Until you could say that you learned something, you are going to be trapped in here. We're going to torture everybody until value is delivered.
I did check. There is more wine. Can you just give us your top of mind on physical AI?
Remember what software is. Software is a tool. There's this notion that the tool -- the software industry is in decline, and will be replaced by AI. You could tell because there's a whole bunch of software companies whose stock prices are under a lot of pressure because somehow AI is going to replace them. It is the most illogical thing in the world, and time will prove itself. Let's just give it -- let's give ourselves the ultimate thought experiment.
Suppose we are the ultimate AI, artificial general robotics, the ultimate AI, the physical version of us. You could, of course, solve any problem because you're humanoid, you could do things. If you were a humanoid robot, would you use a screwdriver or invent a new screwdriver? I would just use one. Would you use a hammer or invent a new hammer? Would you use a chainsaw or invent a new chainsaw? It just don't -- first of all, ideally, they don't use it at all. But do you understand what I'm saying? If you were a humanoid robot, artificial general robotics, would you use tools or reinvent tools?
The answer, obviously, is to use tools. And so now do the digital version of that. If you were an artificial general intelligence, would you use the tools like ServiceNow and SAP and Cadence and Synopsys? Or would you reinvent a calculator? Of course, you would just use a calculator. That's the reason why the latest breakthrough in AI is what? Tool use, because the tools are designed to be explicit. There are many problems in our world where F equals MA. Please -- could you please not come up with another version? F is not kind of MA, it's just MA. Do you guys -- Ohm's law, V equals IR, is not kind of IR, approximately IR, statistically IR. It is IR. Okay? Do you understand what I'm saying?
So I think we want the artificial general robotics, artificial general intelligence to use tools. Well, that's the big idea. I think that in the next generation of physical AI, we're going to have AIs that understand the physical world, understand causality: if I tip this over, it's going to tip all of that over. They understand the concept of a domino -- just the concept of domino, notice a child understands if you tip that over. The concept of domino is extremely -- it's like deeply profound. The causality, contact, gravity, mass, all of that is integrated into a domino. Tipping a domino over.
The idea that you could have a little tiny domino, tip a larger domino, tip a larger domino, tip a larger domino to the point where there's a ton on the other side, a child has no trouble with that concept.
A large language model will have no idea. And so we have to teach. We have to create a new type of physical AI. Well, what's the opportunity? So far, the industry that Chuck and I have been part of is about creating tools. We have been in the screwdriver, hammer business. Our entire life has been about creating screwdrivers and hammers. For the first time in history, we are going to create what people call labor, but augmented labor.
Give you an example. What is a self-driving car? It's a digital chauffeur. What's a digital chauffeur valued at? A lot. A lot more than the car. And the reason for that is because in the lifetime of the digital chauffeur, the economics of the digital chauffeur is a lot more than the car. For the very first time, we are exposed to a TAM that is 100x larger, literally, mathematically, true. The IT industry is about $1 trillion, right, or so, plus or minus a couple, and yet the economy of the world is about $100 trillion.
For the very first time, we're going to be exposed to all of that. So it is the case that all of you, all of you, everybody in this room today, you have the opportunity to apply this technology to become a technology company.
Let me give you some examples. I really believe as much as I -- look, I love Disney, and I love working with Disney. I'm pretty sure they'd rather be Netflix. I love Mercedes. I came in a Mercedes. I am certain they'd rather be Tesla. I love Walmart. I am certain they'd rather be Amazon. Do you guys agree so far? Am I 3 for 3? All of you are that way.
I believe that we have an opportunity to help transform every single company into a technology company. Technology first, technology first. Technology is your superpower and the domain is your application versus the other way, which is the domain is who you are, and you're seeking technology.
And the reason that's so -- the reason that's so is because companies who are technology first, you're dealing with electrons, not atoms. And electrons, there's a lot more of them. Atoms, you're limited by mass, which is the reason why the moment they went from CD ROMs to electrons, the value of the company exploded by 1,000 times. You need to be like us, an electronics company, electron company, which is another way of saying a technology company. And so I think that the opportunity for you is here.
Another way to think about that is AI, and we just said it earlier, even Chuck who only knows how to program in Hebrew.
It's a gift.
His instruments choices -- and right to left because, as you know, it's otherwise.
It is pretty smart actually.
Smart people do smart things. And so the beautiful thing is that, as you know, the programming language of the world and for all of your companies, you kind of feel like, "Oh my gosh, software is not our strength," but knowledge, intuition, domain expertise is your strength. Well, you get to -- you now for the first time, can explain exactly what you want to a computer in your language. Do you remember where we started from explicit programming to implicit programming?
For the first time in history, you could program a computer implicitly. Just tell it what you want. Tell it what you mean. And the computer will write the code because coding as it turns out, is just typing and typing as it turns out, is a commodity. And that's the great opportunity for you. All of you could be levitated above the atomic limitations that you were limited by before. All of you could escape from this limitation, which is we don't have enough software engineers because as it turns out, typing is a commodity. And all of you have something of great value, which is domain expertise to understand the customer, understand the problem. And that is the ultimate value. That is the ultimate value, to understand the intent.
As you know, when you graduate from software -- when you graduate from college, you could be a super programmer, but you have no idea what customers want. You have no idea what problems to solve, but that's what all of you know. You know what customers want. You know what problems to solve. The coding part of it is easy. Just tell the AI to do it. And so that's your superpower. So Chuck and I are here to enable you to do that. That closing was done with 5 glasses of wine to me, and so it's a miracle indeed between somebody who works off a tablet...
This is true representation of artificial intelligence. Or maybe that's enhanced intelligence.
I just want to tell you that it's a great pleasure working with all of you. Cisco, as you know, has extreme expertise in 2 very important pillars of the invention of computing. Without Cisco, there is no modern computing. One of them is, of course, networking and the other one is security. And those -- both of those pillars have been reinvented in the world of AI. And the part that we know very well, which is the computing part of it, in a lot of ways, is a commodity. And the stuff that Cisco knows is deeply valuable. And between the 2 of us, we're going to -- we'll be delighted to help all of you engage the world of AI.
And then somebody asked me earlier, and I just said, I think it's worth repeating. Somebody asked me earlier, should you do -- just rent the cloud? Or should you even make the effort to build your own computer?
Here's what I would tell you. I would advise you to do exactly the same thing I'd advise my children, build a computer. Even though the PC is everywhere, even though it's mature, even though the technology is developed, for God's sakes, build one. Know why all the components exist. If you were to be in the world of the automotive -- the automobile industry, the transportation industry, don't just use Uber. For God's sakes, lift the hood, change the oil, understand all the components. For God's sakes, understand how it works. It is vital.
This technology is so important to the future. You must have some tactile -- tactical understanding of it. Lift the hood, change the oil, build something. It doesn't have to be large, build something. You might discover you're actually insanely good at it. You might discover that you need that skill. You might discover that the world is not about all rent versus all own, that you want to rent some and own some because some part of your company should be built on-prem.
For example, sovereignty and proprietary information. And you're just -- you're not comfortable. You're not comfortable sharing your questions to everybody. You know the reason why I've never -- this is a conceptual example. You know that when you go see a therapist, you don't want the questions to be online. You know what I'm saying, okay? I'm just -- I'm imagining this one. Okay. So hypothetically I think that a lot of questions that you have, a lot of conversations, you have, a lot of dialogue, a lot of uncertainties you have ought to be kept private.
Companies are the same way. I am not confident. I am not secure about putting all of NVIDIA's conversations in the cloud, which is the reason why we build it locally. We've built a super AI system locally because I'm just not confident to share that conversation because my -- as it turns out, the most valuable IP to me is not my answers. They're my questions. Are you following me? My questions are the most valuable IP to me. What I'm thinking about are my questions. The answers are a commodity. If I simply knew what to ask, I'm identifying what's important. And I don't want people to know what I think is important. And I want that to be in a small room. I want that to be on-prem. I want that to be by myself. But I want to create my own AI and then one last thought.
Since it's already 11:00. One last thought. There was an idea that AI should always have a human in the loop. It's exactly the wrong idea, it's backwards. Every company should have AI in the loop. And the reason for that is because we want our company to be better and more valuable and more knowledgeable every single day. We never want to go backwards. We never want to go flat. We never want to start from the beginning, which means that if we have AI in the loop, it will capture our life experience. Every single employee in the future will have AI, lots of AIs in the loop. And those AIs will become the company's intellectual property. That's the future company. And therefore, I think it's sensible for all of you to call Chuck immediately.
And I'll call Jensen.
Anyhow, that's my close.
Listen, let's -- 2 weeks on the road, Jensen flew here, spent his last night -- last evening with us before he gets to sleep in his bed for the first time in a long time. We're forever grateful. Appreciate you being here. Thank you so much.
Thank you very much.
Thank you, man.
And from the corner of my eye, there were all these skewers. I hope it's still there.
Where is the bag of Fritos...
All right. Let's go. Thank you. Thank you, everybody.
NVIDIA — Second Annual AI Summit
📣 Core Message
- Narrative: AI is transforming the entire computing architecture from explicit to implicit programming. NVIDIA positions itself not only as a chip supplier but as a provider of a complete AI stack (compute, storage, networking, security) with a focus on agent-capable, application-driven solutions.
🎯 Strategic Highlights
- Cisco partnership: A close integration plan was announced between NVIDIA's new Vera Rubin stack and Cisco's Nexus control plane. The goal: AI performance plus enterprise manageability and security.
- Tool ecosystem: Collaborations with Synopsys, Cadence, Siemens and Dassault to embed NVIDIA technology in chip design and engineering workflows, with an emphasis on accelerating internal development processes.
- Adoption strategy: Recommendation to companies: experiment broadly ("let 1,000 flowers bloom"), curate later; strong emphasis on on-prem deployments for IP sovereignty.
🔍 New Information
- New: Primarily strategic and product-related statements (Vera Rubin, Cisco integration, named partners). No new financial metrics or guidance; the session delivers vision and partnership details but no quantitative targets.
❓ Analyst Questions
- Entry strategy: Asked about concrete steps for companies, Jensen recommends focusing on the core business, experimenting, and consolidating later rather than assessing ROI immediately.
- Build vs. rent: Discussion of renting cloud capacity vs. building in-house. Management advises owning at least some parts on-prem (sovereignty/IP), without giving specific cost or timing figures.
- Critical points: Hard numbers were missing: no quantified ROI, no timelines for product rollouts; answers remained strategic and visionary rather than operationally concrete.
⚡ Bottom Line
- Implication: The presentation strengthens NVIDIA's long-term growth scenario by extending the TAM beyond chips alone (infrastructure + platforms + partnerships). In the short term, concrete financial metrics are missing; valuation remains dependent on the pace of enterprise adoption and on-prem demand.
NVIDIA — 44th Annual J.P. Morgan Healthcare Conference
1. Question Answer
All right. Good afternoon, and welcome to JPMorgan's 44th Annual Healthcare Conference here in San Francisco. My name is Harlan Sur. I'm the U.S. semiconductor analyst for the firm. For the seventh year, we have the team from NVIDIA presenting. As all of you know, NVIDIA is the leader in accelerated computing in AI semiconductor, software, systems, enabling the development and deployment of the world's AI foundational models, like large language models, enabling next-generation reasoning and agentic-based frameworks and now moving the industry adoption curve to physical AI and driving compute innovation for cloud, hyperscalers as well as large vertical markets like health care and life sciences.
Here with us today from NVIDIA is Kimberly Powell, Vice President and General Manager of Healthcare at NVIDIA. She's responsible for the company's worldwide health care business, including hardware and software platforms for accelerated compute and AI that power the ecosystems of imaging, genomics, life sciences, drug discovery and health care analytics. Kimberly, great to have you back again. Let me turn it over to you.
Thank you, Harlan. Thank you so much. Thank you. Good evening, everyone. This is the first time I've been between you and cocktails, but I'm going to make sure it's as entertaining as humanly possible. Just a few words that this is an absolute once-in-a-generation platform shift for the health care industry. And I am so honored to be invited back for the seventh year, we're in our 17th year of working on health care at NVIDIA. And I thank the conference very much for this opportunity, and all of the partners that we work with day in and day out, which I hope I'm going to be able to share what the future is going to look like.
Before we get into it, please take a moment and look at our forward-looking statements. And action. Okay. 2025 was an absolute breakout year for agentic AI. Many things came together in just the last 12 months. You've heard the words reasoning, AI models that can reason. You've heard the words tool use, software that can actually use tools on behalf of the human user. You've heard the word retrieval, being able to attach language models with trusted information and trusted knowledge. Agentic AI is here, alive, and being deployed faster in health care than any other industry.
The ChatGPT moment, Jensen just described to us last week, has arrived in physical AI, the amount of progress we are now able to make in robotics because we've closed the loop in a very important domain called simulation, where robots and embodied AI will learn in a computer first before they're ever deployed to the real world is here, and it's having profound implications across the entire health care and life sciences industry. And then thirdly, and we've been working on this for some time, and many companies are here sharing with you that AI is starting to understand and learn the laws of nature, biology in particular.
You might call us at the beginning of the transformer moment of biology. Let's start, though, with thinking about what's happening in AI. Let's take a minute to understand how AI is really making such rapid progress. One of the most important things we need in the world is open models and open software. Just like you could think about Linux back in the day as an operating system that created brand-new markets. We're exactly here at this time. And what's amazing to think about is open models are now reaching the frontier, which is giving an opportunity to every startup company and actually every enterprise to participate as fully as very well-stocked AI labs around the world. Open models and reasoning models that really came into light at the beginning of 2025 are absolutely the backbone of innovation. These are models that can think and they're much more relatable to humans, and they can create the essence of transparency and explainability.
They can break down very complex tasks that otherwise were just untouchable or had to be hand coded in the past software. So 80% of startups today are built on open models and it's in a very, very important strategy to NVIDIA. Over the past several years, we have been amassing a huge body of work in the open source. In 2025, actually NVIDIA became the world's largest contributor of open-source AI on Hugging Face. We have over 650 language models that have been contributed, 250 data sets. And those are not only in language. They're also in biology, in chemistry, robotics and vision.
And so key to developing an ecosystem is to not only provide open models. When we say open models, we actually mean 3 things. One is the model itself. Two are the open data sets. For any company or any industry that is regulated, you need to understand how these models came to fruition. You might even need them for auditing purposes down the road. So open models, open data set. And the third is open tools. Just like all intelligence, learning is never finished. Every single user interaction you have with your software, with your application is training data to enhance the system going forward.
So you need to create a whole tool chain for the end-to-end AI life cycle which is essentially a never-ending life cycle. So when we say open models, yes, the capability of open models are extremely important, but it comes with open dataset and the open tools. And so we've been pioneering in some very important places. We just announced the third generation of our Nemotron language models that are absolutely at the frontier for Agentic AI, we've recently announced our physics AI models, Earth-2, things that we can do in climate and weather simulation.
Our own Clara models in health care for biomedical AI that spans everything from target discovery to molecular design and medical AI reasoning. And then we're at the dawn of this ChatGPT moment for physical AI. And a lot of that physical AI has come from the foundation models we have pioneered at NVIDIA, last year's CES was actually the breakout moment for Cosmos, which is the best innovation of the show for understanding the world, world foundation model, understands the laws of physics, understand spatial awareness, can create digital worlds of all kinds with millions, billions of permutations in which robots and physical things can learn in these environments.
GR00T is for robotics, so that you can train robots to operate in these physical worlds, all of the different training policies, all the different tasks that need to take us from very specialized robots to more generalized and they can complete really amazing tasks. And just last week, Alpamayo, which is for self-driving cars. The first time we've ever open sourced. This is a model that is essentially a thinking autonomous vehicle model. It has large language models at the root of it, and it's an end-to-end driving system. Incredible work that is going to lay the foundation for much of the physical AI to come.
What I also love about 2025 and us moving forward is agentic AI has become hireable. I would hire this man. I think all of you would as well. And what do I mean by that? All of those breakthroughs that I just described, the ability to reason, the ability to call the tools necessary, the ability to interact with antiquated systems, whether they're scheduling systems or otherwise, all of that has largely now become solved in the age of Agentic-AI. And so health care systems all over the globe are recognizing that they can start hiring these agentic systems and platforms essentially as digital coworkers to close this extreme gap we have in terms of health care services and the number of health care professionals we have. As you know, the World Health Organization predicts we're going to be tens of millions of health care providers short by 2030.
We can offload our amazing health care professionals who dedicate their lives to their profession and offload a lot of the clerical work that isn't necessarily clinical work. I love this report from Menlo Ventures that describes that I would say, for the first time in my history and my being in this industry that health care is leading the pace at a technology, enterprise deployment and adoption. It's actually at 3x the pace of the U.S. economy. And that is because it's solving such acute challenges. And so it absolutely is here a USD 4.9 trillion market. And we are deploying AI at this incredible scale. These are paid and enterprise-grade software systems that are now being hired in the psychology of CIOs and health systems all around the globe are recognizing this as an opportunity.
It is just not possible to go out and find another 500 doctors that you can hire into your system. But by offloading your amazing doctors with systems that I'm going to share with you in just a moment, we have an incredible opportunity to do so. I've talked about this -- we've talked about this for quite some time that the way software is being built has fundamentally changed. This is, in essence, what all agentic systems and what all Software as a Service platforms will look like when they are agentic. They are prompted with some input and this amazing reasoning system kind of understands the user intent.
You will always use frontier models, and you will also augment these frontier models with specialized models because the work that gets done in industry is exquisite work, it's specialized work. There's subject matter expertise involved. And so you have to call upon a lot of different tools in order to connect these agents and connect these otherwise generalists to become specialists and deliver the value that's necessary in the industry. Let me share a couple of amazing examples. I think many of us here have heard of Abridge and actually, they're on the conference schedule this year. Abridge is a clinical conversation AI platform.
Their platform again, looks like those systems by connecting these systems in such exquisite ways, understanding workflows in a building block sense to transform workflows. They're giving 30% or more of doctors' time back at the end of the day, helping them generate reports and prior authorizations, deployed in over 200 health systems already, and that number is growing dramatically faster even as we've seen in the last 6 months. Corti is a health care agent platform that is helping Europe and the NHS deploy agents of all kinds.
And similarly, we have Speechmatics and Sully, who are creating agents to triage, creating agents to check you in, agents that can be deployed all over the hospital for, again, not necessarily clinical work but amazing workflow that creates a win-win situation. It's a win situation for the health care systems because more patients can come through. And it's a win for the patient because the experience is far improved. Now these agents are going into another high stakes, very high-cost area of the industry, and that is in clinical development. This is a part of the drug discovery and medical device process that is absolutely necessary. But it's a very challenging part of the system.
It's very labor-intensive. It's very manual. It's frankly error prone. We work with amazing companies like ConcertAI, which is helping to stratify these clinical trials and even simulate their outcomes so that you can do much, much better planning, the amount of money, time and resources that can be saved and the precision to get to where these clinical trials need to go faster. CytoReason is essentially building the capability to do drug development by building disease models, using knowledge graphs and otherwise to really understand and help build better modeling with all of that real-world data.
And we've been working with IQVIA now for well over a year, and they have agentic systems being deployed from the commercial deployment of commercial teams who can give you an essence of in what region, what physicians you should call on with very relevant data so that they can have much more productive commercial teams as well as into all of the clinical trial, finding the right start-up studies and building those at a much, much faster pace than before.
So this agentic digital health ecosystem is being built on NVIDIA, our open models, our tooling and our ability to help them connect and build out these agent systems to do incredibly wonderful things. Now agents are at also a very exciting inflection point where they are accelerating science. This is the loop of science that is emerging, not only in life sciences, but particularly in life sciences, it's having a very, very accelerated effect on how we're thinking about doing science. Science -- AI scientists are agentic systems, who you could imagine, could go off and read the literature, you can go back and forth and reason with them, they can help you design the experiment. They can call upon tools that could be foundation models for protein structure prediction or do virtual screening in the digital lab or they could actually go off and kick off an experiment in a physical lab.
And you can also think about the computational dry lab as this connective glue to close this loop as we described. As you have the agentic system, you have the physical space, but you need to constantly take every experiment and build that into digital intelligence of the R&D work that's being done all over the world. There is a new emerging ecosystem of a category of AI science companies who are building on, again, NVIDIA's Nemotron, a huge additional breakthrough of last year that's come into vogue in order to create these agents that go beyond generalist understanding and into science and technology, and it's called reinforcement learning, using experimental data to reinforce these models and tune them into a very particular science task.
Edison is the new commercial company that came out of FutureHouse. This is a stunning AI scientist. This scientist can go off and read 1,500 papers, write 40,000 lines of code and synthesize a research report in about 16 hours, work 16 hours straight and essentially do the amount of work that would otherwise be 4 to 6 months by a researcher, pretty mind boggling. LiLA is building a super intelligence, a completely integrated autonomous lab and agentic system. So literally, all experiments come back and feed the super -- to feed the super intelligence and creating a complete closed loop system. And Owkin is combining biological language models with deep patient data to really help biopharma teams have a higher confidence in their decisions.
So the science agents are really here and they're making a profound impact on that -- there's really a new paradigm in science that is emerging. And as we know, life sciences is one of the largest science domains and pharma R&D is the largest of that. And so a $300 billion industry of R&D is going to be reinvented with this paradigm. I want to share with you how agents are entering the lab. They're not only going to be co scientists along with you that you're kind of talking with, working with, hypothesizing, they're creating reports for you, but they're actually going to do work for you on behalf of the scientists or with the scientists in the lab. Let's take a look.
[Presentation]
We're super excited to announce today that we're working with Thermo Fisher Scientific, the world leader in lab instrumentation and services, to build what we're calling the fundamental AI infrastructure for the lab. You can see there that little gold box is essentially a benchtop AI supercomputer called DGX Spark. This DGX Spark can run any workload of AI and accelerated computing. You can hold it in the palm of your hand. And so we pioneered this first agentic system where you essentially can make the instrument intelligent right there with that amazing gold box and some agents that we built together. And you can close the loop with the scientists there. Sometimes, you can close the loop right at the instrument. The instrument, with an automatic quality control agent, can understand, oh, I need to go clean something in the instrument, and it can autonomously self-relieve it and come out with much better experimental data.
And so it is very clear that this is going to scale the throughput of labs. It's going to increase the quality of experiments, and no longer are humans going to be the thing that's bottlenecking the amount of data that can come in to do science. So this is just an amazing partnership and we're delighted to be partnering with Thermo Fisher.
Now getting into physical AI: labs are one of the most chaotic and bespoke physical environments out there. And so we really want to think about scaling labs with robotic intelligence in those real worlds of physical labs. There's a journey here: we want to have not only specialized robots that really understand a given instrument, we also want them to be generalized. Actually, we want the best of both worlds. And so to get to that best of both worlds, NVIDIA has created the physical AI three-computer platform. I was describing this earlier: you use simulation and our Cosmos world foundation model to create digital worlds to train these robots in, where you can vary the lighting.
You can move beakers. You can practice all sorts of different tasks and train your robot in simulation first. You can use Isaac as the training platform to train all sorts of different types of robots for tons of different tasks. Sometimes they need to be contamination tasks, sometimes they have to have different perception because they're looking for barcodes, or they're looking for a glass of a particular size, or they're pipetting. And then we also have the edge computer for you to go off and deploy this. So AI and lab automation is reaching far into the physical world.
And so again, a new class of companies is emerging, not only in lab automation, but in robotic lab automation. We're working with some fantastic companies in this space. Multiply Labs is using our Isaac platform to train their robots. They're doing amazing work in cell and gene therapy biomanufacturing labs, where they're using the Isaac system to train literally thousands of different tasks, because as you're going through some of these more complex therapies, there are many, many steps involved, and these are very precise steps, and you don't actually want humans involved because of the contamination aspects of it.
And so they've made some tremendous breakthroughs where they take a cell therapy that costs $100,000 to manufacture, and they're reducing that down to $30,000, over 70%, with their robotic systems. And they're essentially putting 100x the throughput into a given square footage of lab environment. These are the breakthroughs that are going to scale medicines to where we need to go. Similarly, HighRes builds complete lab automation at a very large scale, exquisite robots that are learning, again, in our environment, using Isaac and Cosmos to train them and learn all of these different tasks to take automation to yet another level.
And then Opentrons, very well known for their liquid handling and deployed in 10,000 labs around the world, is again using our platform to build the simulation environments and increase the velocity at which these robotic systems are able to tackle more and more complex tasks in the lab.
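The Multiply Labs cost figures quoted above can be checked with simple arithmetic. What follows is a minimal sketch in Python that assumes only the two dollar amounts stated in the talk; the function name `percent_reduction` is illustrative and not part of any NVIDIA or Multiply Labs tooling. With these rounded inputs the reduction works out to 70%, so the "over 70%" in the talk presumably reflects unrounded costs.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the cost reduction from `before` to `after` as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Figures as quoted in the talk (rounded): $100,000 -> $30,000 per therapy.
cell_therapy_reduction = percent_reduction(100_000, 30_000)
print(f"Manufacturing cost reduction: {cell_therapy_reduction:.0f}%")
```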
All right. We are in a final chapter here, on AI starting to learn the laws of nature. I took a few headlines. The NeurIPS conference is the flagship AI conference of the year. And there were over 30 workshops, with developers of all kinds. There were 50 or so biology and life science company parties at this event. And it was written up as biology's transformer moment. The AI revolution in drug making is well underway now. And we're really starting to see AI-enabled medicines reach the later stages of clinical development, which is extremely exciting, and a lot of those companies are here.
We are working really hard to help push the frontiers in this area of biology transformation models. We don't aspire to necessarily be a biology foundation model company. But there is all of the methodology, all of the challenges it takes to scale these models at a domain-specific level: language is short words assembled into the sequence of a sentence, which looks very different from a 3-billion-character-long DNA sequence. And so we need to think about context length when we're talking about biology. We need to have different model architectures.
These models have to get grounded in physics. So there are a lot of interesting challenges. And so we've been adding to our Clara open models to help the entire research industry really accelerate the ability to train larger models, and multimodal models, as we go. And we're really proud of some of the work we're doing in La-Proteina, which allows you to essentially design proteins at atomic scale. Our very new one that we're announcing is RNAPro for RNA design, the first that we've had. And Merck was just up here. We did some exciting work with them on the KERMT model, which is all about predicting toxicity. And so we're really trying to work across the drug discovery process with these open models.
We have a reasoning model for molecular synthesis in our version 2. Really excited about these models. And so we also announced today a pretty massive extension to the NVIDIA BioNeMo platform. So not only are we investing heavily in these open models; as I said, it's not just about the model, it's about the data sets. And we have a road map to continue to invest in data. This industry, like other industries such as self-driving cars, will benefit from synthetically generated data. And so we've generated some synthetic proteins. We also care a lot about data processing, things like cheminformatics workflows such as RDKit: we now have a GPU-accelerated nvMolKit that is 100x faster in chemistry processing. And so this platform expansion is really, really foundational.
And as I said, it's part of that glue. It's that digital dry lab that is going to take all of the intelligence from experiments and continuously enhance these models going forward, which can then be called by the agentic systems as their tools. So what's exciting is we're seeing enterprise adoption of BioNeMo in some pretty exciting platforms. Basecamp Research is an AI-native company who announced their EDEN platform today here at this conference. This is a GPT 4 size biology model that was trained on 10 trillion biology tokens.
And it's able to now do what they're calling gene insertion, and their validation results for what it's able to do in antimicrobials and in cancer are really pretty groundbreaking. Amazing EDEN platform. We're working with Natera, who works in cell-free DNA; they're training their own models and also building agentic systems into their platform to advise in clinical development and in clinical decision support. And then TetraScience is a scientific data platform, again, to try to connect all of these.
As we know, science is oftentimes done very narrowly. But you can now start to learn across many different data sets to ask all of these scientific questions, and we're working with them on that. They're deploying BioNeMo models inside. They're deploying Nemotron, and it's an amazing platform for scientists. And so this is the vision. This is the new paradigm in science. And as I said, there's an amazing group of new markets and new companies being built all around this vision of AI scientists that can call tools that are constantly getting smarter from the experiments.
And these experiments have automation that will come from the agents setting it up, but you're still going to need robotics and more to help you execute that in the physical world. To bring this whole vision together, we are so excited to announce an extension to our partnership with Lilly. Today, we announced a first-of-its-kind co-innovation AI lab with Lilly. This is the first time we are going to be joining together world-leading scientists with world-leading AI researchers in South San Francisco, here in the Bay Area, co-locating with the amazing science and lab understanding that comes with doing drug discovery. We'll be investing $1 billion over the next 5 years to really push the frontier in this new paradigm of science and of accelerated drug discovery.
This is building upon their deep belief that they see this transition from 90% wet lab and 10% compute to a paradigm that's deeply, deeply flipped in the coming years, and it's going to accelerate the breakthroughs. We're going to work on clinical development, and we're going to work on manufacturing and lab automation, just like we described. Lilly is world-class in manufacturing, and accelerating the ability to deploy physical AI throughout labs and manufacturing is also going to continue to be transformative and help them meet the amazing demand that they've created for medicines in the world.
So this has been a phenomenal kickoff to the year. I'm going to leave you with one last video, and then I'm going to join Harlan over there for some Q&A.
[Presentation]
Great presentation. Thank you, Kimberly. I'm going to kick off the Q&A.
We've seen the deployment of massive, what Jensen calls, AI factories by the leaders in your segment of the market like Amgen, Genentech and recently Lilly, with their Blackwell Ultra-based DGX SuperPOD, right? And this potentially signals the shift from pilot programs to industrial-scale development and deployment. Can you walk us through the economic conversations you're having with pharma CFOs today? Are we at the point where they view GPU compute investments not just as an R&D expense, but as essential capital infrastructure that integrates AI agents and AI tools and directly determines their pipeline throughput and probability of success?
Yes. I do think, as I was just describing, this is a completely new paradigm in science. And if you think of the amazing scientists as employees, you really want them to be as productive as humanly possible. And so there's a new scientific method sort of emerging. Think of Lilly and many pharmaceutical companies: hundreds of years of science is written down in electronic lab notebooks, and it's kind of shoved all over the different parts of the company. You actually have the ability now to build all of that back into the cardinal knowledge of the company, from every scientist that works there. It's a very transient industry, frankly.
But why lose all of that deep understanding of a scientist when they leave the company, you can actually inject that back into the system. So from that respect, it's taking all of that amazing high-value data and doing something with it to empower the whole organization and then this transformer moment that we're in is becoming very clear.
I mean, we're in the fifth year now post AlphaFold and its true initial impact, and it was the inspiration that has driven a lot of work in the area of models. And now there are thousands and thousands of biology and molecular models being built every single day. And with the sort of democratization, if you will, of what we're doing with open models and the data sets and the tools, we're giving these capabilities to science teams who are not OpenAI-like. They are scientists, that's the job they're hired for, but they can now become AI scientists and AI researchers just the same, because we're making it much more accessible for them to develop these things.
And so I think of the visual where you have agent scientists working alongside you, and then you have a dry lab and a wet lab. The dry lab is going to be exactly like the wet lab: you're going to have the dry lab be as intelligent, reasoning with you and calling upon all of the data that you've ever built and all the new data that you're building. And so I absolutely think that it's going to be thought of exactly like your wet lab, and I see this 90-10 flip starting to go in that direction. And it's not going to be less lab expense; we're just going to do much more science. That's what this is all about. This is not a paradigm in which, if you flip it to computation, you don't need more of the lab. Absolutely not. Just think about radiologists and the reports that are coming out.
Everybody thought that because you can do the task of reading an image and finding something in it, which is one of the tasks that a radiologist does, we weren't going to need radiologists soon. That was what one of our godfathers said. Well, in fact, reports are just out: we've increased the number of radiologists that are being hired because there's more work to do. And so we should just think of this all as, we're going to do fundamentally much more science, which essentially will also lead to many, many more breakthroughs.
Last week, we held the Consumer Electronics Show, right, and Jensen had a big NVIDIA live event where he announced the team's new GPU computing platform called Vera Rubin, right? And one of the key highlights was that Vera Rubin is going to continue to drive cost per token, or cost per inference, lower by up to 10x, right? And with every generation of GPUs that the team has brought to market, the cost of inferencing, the cost per token, is going down somewhere between 3x and 10x. This is per year, right? So with that in mind, in hospitals, you've highlighted the labor shortage crisis and introduced agentic AI as a solution for everything from patient triage to administrative coding.
For a hospital CEO operating on razor thin margins, what is the immediate ROI of deploying NVIDIA-powered agents compared to the traditional staffing? In other words, is the cost of inferencing finally now low enough to make this viable for mass market sort of health care adoption?
Yes. And you're right. In the last 4 years, we've gone from Hopper to Blackwell to Rubin. And in those 4 years, we've reduced inference cost by well over 100x. So if you were paying $1 to run an agent, you're now paying $0.01. And you need this, you need this for rapid adoption. These companies that I just described: Abridge now has hundreds of millions of users, right? OpenEvidence has hundreds of millions of users, and they're using it constantly. And so we have to continue to drive the cost down.
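The cost arithmetic in this answer can be made concrete. The following is a minimal sketch in Python: the function names are illustrative, 100x is used as the lower bound of the "well over 100x" quoted above, and spreading the improvement evenly over the 4 years is an assumption made here for illustration, not something stated on the call.

```python
def cost_after_reduction(cost_before: float, reduction_factor: float) -> float:
    """Cost of the same workload after an N-fold cost reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return cost_before / reduction_factor


def implied_annual_factor(total_factor: float, years: float) -> float:
    """Constant yearly improvement factor that compounds to `total_factor`."""
    return total_factor ** (1.0 / years)


# $1.00 per agent run, reduced 100-fold.
agent_cost_now = cost_after_reduction(1.00, 100)
# Assumed even compounding of the 100x over 4 years.
yearly_factor = implied_annual_factor(100, 4)
print(f"${agent_cost_now:.2f} per run, about {yearly_factor:.2f}x cheaper each year")
```

Under that even-compounding assumption, a 100x drop over 4 years corresponds to a little over 3x per year, which sits at the low end of the 3x to 10x range cited in the question.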
Now the return on investment is very clear. If a doctor has 30% of their time back, that's either 30% of life that they can return to having with their family and keeping them employed and safe at work or you can also see 30% more patients. I mean it comes with all sorts of benefits. It's a win for the patient. It's a win for the health system. And so you can measure a lot of those companies that we talk to, they're literally measuring how many clinical minutes they're giving back to the organization. And I think it was the Sully and Speechmatics, there's something like 57 years they've already measured since the platform has been in deployment that they've given back to the health system.
So that's clearly measurable in ROI because the more you can give back of free minutes is essentially the more patient throughput you can have.
When we think about accelerated compute and AI, we typically think about use cases, customers as being the large cloud, hyperscalers, your corporate and enterprise partners, but Jensen always reminds us, right, that there's a sovereign AI opportunity. It's a $20 billion per year market opportunity. We've seen Japan's Tokyo One. We've seen Denmark's Gefion supercomputers launched with a very heavy focus on health care and genomics driven by the need for data sovereignty, national competitiveness. Do you view sovereign health care cloud as a stand-alone growth factor for the team separate from enterprise and do you expect every major economy to build a similar type of infrastructure build-out?
Yes. So to answer your last question first, yes, I expect every country to be able to take advantage of this incredible again, once-in-a-generation opportunity. Some countries will go from 0 health care services to complete AI-native health care services, and that's a fantastic opportunity. And now that we've made it so accessible to do so, we've done several things. NVIDIA's platform is inside of every public cloud, NVIDIA has also pioneered essentially a generation of what they're calling neoclouds.
So clouds that are residing within the walls of certain countries, giving every country the opportunity, I mean if you think about what AI infrastructure is, it's just as important as roads, electricity, water, it is a necessary infrastructure for any country to prosper in the future. And so they can get it in the public cloud, if that's good for them, they can start building their own, a lot of telecom companies are transitioning themselves into cloud companies that can be hosted.
You can build it inside your own enterprises, if you like. And so to answer your first question last: no, it's not separate. It is all part of our enterprise business. We've just now created the conditions so that everybody can, should and will build their own infrastructure to serve their own country and prosper.
Perfect. Well, we are just about out of time. Kimberly, thank you for your participation. Looking for another strong growth year for the team.
Thank you so much.
Thank you.
NVIDIA — 44th Annual J.P. Morgan Healthcare Conference
📣 Key Message
- Core: NVIDIA is positioning itself as the infrastructure and platform provider for agentic AI in health care: open models, data sets and toolchains plus physical AI (robotics, lab automation) are meant to raise productivity in clinics and accelerate drug development.
🎯 Strategic Highlights
- Open source: NVIDIA emphasizes its leadership in open models and open data (including >650 language models and 250 data sets on Hugging Face) as a growth driver for ecosystems.
- Platforms: Expansion of Nemotron, Clara, BioNeMo, Cosmos and Isaac for multimodal biology, robotics and lab workloads; nvMolKit for chemistry workflows (GPU-accelerated).
- Partnerships: Thermo Fisher (DGX Spark benchtop AI), numerous startups (Abridge, Corti, Multiply Labs) and co-innovation with Lilly (see below).
🔭 New Information
- Announcements: Collaboration with Thermo Fisher around DGX Spark (a small AI computer deployable on site); expansion of BioNeMo (including RNAPro) and nvMolKit (up to 100x faster).
- Lilly deal: Expanded partnership: a joint AI lab, with investment commitments of around $1 billion over 5 years to accelerate discovery, development and automation.
❓ Analyst Questions
- Economics: CFO perspective: NVIDIA sees GPUs/AI as capital infrastructure for pharma; management stresses long-term productivity and knowledge transfer but provides hardly any standardized CAPEX/ROI calculations.
- ROI in clinical operations: Examples cite 30% time savings for physicians; Kimberly highlights a ~100x reduction in inference cost over four years, though concrete payback periods remain company-specific.
- Sovereign clouds: Sovereign AI is expected; NVIDIA treats it as part of the enterprise business (public cloud, neoclouds, on-premise), not a separate segment.
⚡ Bottom Line
- Relevance: The presentation underpins NVIDIA's strategy of building health care into an important vertical growth area: platform and partner wins (Thermo Fisher, Lilly) increase exposure to long-term software, services and systems revenue, while near-term risks remain in the pace of adoption, regulation and competitive pressure.
NVIDIA — Special Call - NVIDIA Corporation
1. Management Discussion
Welcome, and thank you for standing by. I would like to inform all participants that this conference call, as well as any Q&A, may be recorded. Where a company is presenting, any recording may also be posted on their website. Views and opinions expressed by any external speakers on this call are those of the speakers and not of JPMorgan. Parts of this conference call may be reproduced in JPMorgan Research. If you have any objections, you may disconnect at this time.
Unless otherwise permitted by internal JPMorgan policy, members of JPMorgan Investment and Corporate Banking are not permitted on this call and should disconnect now. I would now like to turn the call over to your host.
2. Question Answer
Thank you. Good morning. Happy new year's, everyone, and welcome to JPMorgan's virtual fireside chat series at the 2026 Consumer Electronics Show. My name is Harlan Sur. I'm the semiconductor and semiconductor capital equipment analyst at the firm.
Very pleased to have Colette Kress, Chief Financial Officer of NVIDIA, here with us this morning. It's been a tradition for the past 12 years to have Colette and the NVIDIA team kick off the investor event here at CES. Colette is going to start us off with an overview of Jensen's NVIDIA live event yesterday, and then we'll go ahead and kick off the Q&A. Colette, thanks for joining us today. Happy New Year, and let me go ahead and turn it over to you.
Okay. Let me first start with a reminder, folks, that this discussion may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to risks and uncertainties facing our business. And then I'll get back to CES and the essence of the announcements that we were here yesterday making.
It's an important time for us to remind everyone about the transitions that are taking place in the market today. There are 3 different transitions, all very important ones. The first one is one that we have talked about for several years: the need to move to accelerated computing. We're beyond the ability to advance that work on just the CPU. So folks are moving to accelerated computing throughout the world.
Secondly, the development of generative AI is also a key transition. It is changing a lot of our work today; whether it be search or social media or otherwise, generative AI is taking part.
But in the future, we also see the third important transition as we move to agentic. Agentic AI is really where it is getting work done, work that can augment the work of many employees and many of our folks at home. All locations are really important, we think, going forward. Those transitions are all occurring together and creating an exponential growth in terms of our compute.
So that's one of the opening statements that we just want to remind folks of, in terms of what we see in AI going forward, but also what we're doing in terms of accelerating today. This event highlights a lot of different focus areas, not only AI and AI for business, but also the work that we are doing with robotics and really thinking about physical AI going forward.
But an important part of the discussion was talking about our next and upcoming version, Vera Rubin. Vera Rubin, as we discussed, has definitely taped out and is ready to go, but this is an opportunity to help folks understand that we are in good shape in terms of bringing this to market in the second half of the year, as we are in full production.
The important part of Vera Rubin, as we discussed, is that it is different chips. And I think it's important to talk about what that means: 6 different chips that have been extreme co-designed to create a data center infrastructure at scale. This isn't about coming here and talking about one different piece, or saying that we are designing and building out the rack. It's more than that, in that every piece of the design continues to be thought through in how it works with each and every one of those different types of chips.
So 6 chips that we're talking about. First, of course, Rubin, our GPU; then Vera, our CPU; then our next version of what we can do in scaling up in terms of our NVLink. It also takes us to Spectrum-X, to what we have in terms of the SuperNIC, but also what we have with BlueField, and then also our switch for CPO. Six different chips have all been harmonized in terms of what we are bringing to market.
We're excited in terms of all the different workloads that it will be able to support, but some of the key things that we have seen already are to understand that this is a full system that will essentially be able to take the time to train down to 1/4 of what we had in terms of Blackwell. Additionally, you have the capability of 10x higher throughput, and thirdly, and an important part in terms of the inferencing phase, it's actually 1/10 the token cost throughout.
So these parts and bringing that together, we are getting ready for that to continue to scale in the second half of this year. And then we'll be in full ramp as we move into the next calendar year as well. So those are some of the highlights, and we can talk about more of it in the discussions.
Yes. No, that was a great overview, Colette. And Jensen spent quite a bit of time yesterday focused on physical AI. And the team has framed physical AI as a massive opportunity powered by platforms and models like Cosmos, Omniverse, Isaac, right? And vertical-specific frameworks like GR00T and Alpamayo, right?
Customers are already here at CES. They're already bringing robots in many different verticals to market using Cosmos and GR00T. The Mercedes announcement yesterday is leveraging the Alpamayo-based reasoning model, right? Is physical AI already a financially material contributor to your data center revenues? And how should we think about the growth curve over the next few years for physical AI?
Physical AI is yet another great opportunity once we advance the agentic AI. And you're correct, there are all different types of models that are going to be needed for physical AI. The important part of what we brought to market and what we discussed is really the need for open source models. And right now, if you think about the top proprietary models, the next in line is the whole set of open models, and how important these are.
Now these open models are important definitely for the enterprise and the work that they're doing, but being able to manage for physical AI, the abundance of modeling there coming up in store and what it was being designed, whether that be for research and whether that be entry to developing the content for them, those models are now in service and here today. So here on the CES floor, even here in terms of our offering, we have full. We're about visibility but also what you have in terms of automotive.
Your question is, are we seeing that today? And yes, Mercedes is coming to market after the very hard work that we have done together over the last 8 years to move to a very high-end self-driving capability in the car, really focused in terms of the safety and the lock. Mercedes has now been able to take the lead as one of the safest cars that will be in the market.
So yes, we are definitely earning revenue from our work with Mercedes as well as many others that are using our platform, whether that be back in the data center, and that's an important piece to keep in mind, the amount of data that is collected and put together in the data center, as well as what is also inside of the car.
As we move forward, taking that to an area such as physical AI for robotics is also going to be extremely important. The learning and the simulation hours from what we've seen in automotive carry over very nicely to what we will be able to do with robotics as well. So yes, as part of that, we see much work in terms of our Jetson platform, our Omniverse platform, and now also our open models helping these important parts of physical AI.
That's great. And you touched upon a very relevant topic in the opening remarks, which is that the team is in production with your next-generation Vera Rubin accelerated compute platform, on track to launch in the second half of this year, in line with your aggressive product cadence. There are 6 chips, as you mentioned, in the Vera Rubin portfolio, and initial performance relative to Blackwell is very compelling, right? 5x better performance, 3x better training performance.
And as you mentioned, and most important to your customers, the 10x lower potential cost per token. We all track the value chain, the supply chain. But if you look at the strong demand curve ahead of you, what are the product areas or categories of the supply chain that you could see constraining your shipments as you start to unlock Vera Rubin in the second half of the year? Could it be 3-nanometer wafer supply? Would it be CoWoS? The memory? Any bottlenecks that you foresee as you think about the strong demand ramp in the second half of the year?
Yes. I think it's right to indicate that, yes, there's a tremendous amount of demand out there for both AI and accelerated computing. And we have been focusing on that significant amount of demand, and then on what type of supply we'd have to purchase.
Keep in mind the work that we do in building any one of these data center infrastructure systems, from the very beginning to the very end: it can take anywhere from 3 quarters to a year to complete. That means a lot of our supply purchasing is not taking place today for what we need tomorrow. It has been in the works for a couple of years, because what it takes is focusing not only on our supply, but on the capacity that is needed, and that is an important part of our process: thinking through every single one of our generations and our future generations and working with our suppliers.
We feel very solid about that in terms of what we see in this new calendar year and what we have in terms of supply. As we move forward, it's something to think about as more and more growth comes: how much more can our suppliers do? But we feel good in terms of what we have ordered, what we've been confirmed for and the supply that we will take for this year.
That's perfect. Why don't we take a step back for a second as we enter the new year, and does it build the concern focus as it relates to NVIDIA in terms of how the market thinks about the engineered team and the trajectory of growth is as we step into the year, the market is always focused on. And by the time that we second to the year, we already have a pretty good view of customers' CapEx in being change, right?
So the market, as always, is very forward-looking, right? And I think the market is starting to think about the infrastructure growth trajectory looking into calendar year '27, right? And if I go back to October of last year, when Jensen talked about $500 billion of visibility and backlog through '26, right? That's on both Blackwell and Rubin GPU platforms, right? And we know that lead times on rack-scale-based solutions are 9 to 12 months.
And it takes a significant amount of, as you mentioned, supply chain management and coordination, capacity buildouts, et cetera, right? But the best proxy, I think, for continued CapEx and infrastructure spending by your customers is your customers' forecasts and orders beyond '26, right, which I assume the NVIDIA team is already focused on. I'm not asking you to quantify, but given what you see in your orders and customer forecasts, are you already seeing a continued spending growth profile by your customers into calendar?
Yes. So let's go back to our GTC D.C. That was an opportunity to help you understand that the combination of Blackwell and Vera Rubin together is about half a trillion through that period of time, through '26. But the important part, correct, is to now start talking about 2027. And think about what it would take to stand up the compute and a full-scale data center.
That takes years, from the land, power and shell, to finishing up the buildout, to eventually putting in the compute and getting that ready. So where we see our customers, and when they can see an event like today, they know that Vera Rubin is here. There have already been discussions in terms of how to think about the amount of demand and where they will put that in the land, power and shell that they have up and coming in the year '27. So that's the right way to think about it.
We're still working on '26. There is still a shortage of demand, and they are still asking whether there are quick adds that we could also make in '26 to help fuel what we need in terms of our demand. So both of these things are happening at the same time, but this is being very helpful to them. They have a good understanding, from an engineering standpoint, of what's capable, and now they can start thinking through the volume of what they will need for their data center builds. So yes, that is exactly what we're focusing on as well.
On the market concerns around an AI bubble, Jensen, and as you mentioned in your prepared remarks, right, you've articulated 3 compute platform shifts that are all happening at once, which should mitigate a spending level, right? And I often feel like the market sort of misses this.
The first is the transition from CPU compute to GPU-accelerated compute, right? I mean, we're seeing this in so many traditional CPU-based compute workloads and CPU-dominated segments of the market where, over time, they're moving from CPU compute to GPU-accelerated compute, right? Jensen has always talked about this, but EDA, chip design software, is a perfect example, where most of the chip design software workloads were run on high-performance server CPUs not that long ago.
But today, many of them are running on GPU-accelerated compute architectures. You see that in the simulation market. You see that in the database markets and so on, right? So that's the first transition, right, CPU to GPU-accelerated compute in the existing traditional compute base. The second driver is, as you mentioned, the strong adoption of Gen AI. And the third transition, again, as you mentioned, is agentic AI. And of course, the onset of new foundation models that will power things like physical AI, right?
So where do we stand along all 3 of those compute platform shifts? Like, where are we in terms of the adoption curve and contribution to your current data center revenue profile? And specifically, looking into 2030, right, let's take a longer-term view on this: how are all 3 of these shifts going to profile into that sort of $3 trillion to $4 trillion of data center spending that the NVIDIA team is forecasting during that period of time?
Great set of questions. First, looking at accelerated computing. Accelerated computing is already here, and many of us have seen it and are working with it almost every single day. There's a massive transformation of how search is completed, of recommender engines, and of essentially almost all of the consumer Internet and how we market to businesses and to consumers. That's an important piece.
But keep in mind, it is going to be a multi-decade transition to get through it all. There's a large amount of moving to software 2.0 and transitioning software from CPUs to a different form of accelerated computing. So we're in the early parts of it. It's moving quite fast. Folks do see the great benefits of accelerated computing and of being able to manage the significant amount of data there is, and you're going to see that for some time moving forward.
However, moving on to the work that we see with generative and agentic AI: that has also created an exponential growth in the amount of compute that's necessary. Because one of the very big parts of moving to agentic was the long thinking: what can I do to get a response to a very difficult, challenging question? That additional long thinking takes a lot more inferencing demand and a lot more token generation as well.
So we are also now seeing a surge in that demand as we move forward. And our vision can see looking at AI as we go forward, has nothing more in the early stages as we move towards these various statistics, data solutions that will augment a lot of the work that we do in our offices as well as we do in terms of personal life. So we know these big markets are driving a lot of this different demand. And in no side do we see any type of shortage or any type of stopping from that.
There's a lot more work to get completed. And the world as a whole still has to get that completed, not just here in terms of some parts and what we see here in the United States. We have a lot of different sovereign AI going on and so that we have many, many different industries.
You have to go industry by industry. You can look at social media, but you have to look at health care, you have to look at automotive, you have to look at industrial, manufacturing. All of these different industries have unique ways of working that have to transition and where AI can be introduced as well. So a lot still to go. And that is why we indicated that by the end of the decade, we are definitely going to be up there in the multiple, multiple trillions, in the $3 trillion to $4 trillion of the amount that will be spent in terms of building out accelerated computing and AI.
Maybe more near term, focusing on calendar '26. Going back again to Jensen's comments at GTC back in October, when he talked about $0.5 trillion of revenue visibility and backlog in cumulative Blackwell and Rubin shipments through '26. Obviously, as you move forward in time, you continue to get updated forecasts and orders.
Ex China, and let's talk about China a little bit later, but ex China, has that $0.5 trillion worth of visibility and backlog through '26 continued to improve? And at what point are you supply constrained and need to push any more orders into.
So the demand, as we see it, continues to increase as folks are looking to enable more compute for a lot of areas, including long, test-time thinking. And so we see this every single day, and since the time that we said $0.5 trillion, of course, we've seen announcements of new deals, both with the CSPs and the model makers, as well as many of the neoclouds looking to add more on to that.
So yes, more has occurred, and we are now starting to see folks working on providing the orders. We have orders for Vera Rubin, and customers are focusing more and more on thinking out a full year of volume, what they may need in terms of Vera Rubin. So we're in a great position in getting a better understanding. We've learned over many, many years that the more insight we provide them into our infrastructure, the easier the planning and process is.
So their demand needs are quite strong, and we are definitely in that process. So yes, that $500 billion has definitely gotten larger. And now we'll probably look at next year as well, to start building up in terms of all the different demands that we have there. But we cannot say anything more than that demand is quite strong.
That's great. No, that's exactly what we're looking for, and that's exactly what we thought. Maybe switching gears because Jensen and the team did a great job, and you did a great job of laying out the performance specs, as I mentioned to you before, right, 5x inferencing performance on Vera Rubin versus Blackwell.
That's on the inferencing side, and 3x better training. And then what's most important is the economics to your customers, and you guys are driving 10x lower cost per token on Vera Rubin versus Blackwell. But I think the market has gotten a better appreciation for, as you talked about, codevelopment as you bring more systems and rack-scale solutions to the market.
It is a solution that is optimized not only around compute. It's optimized around networking, and it's optimized around storage, right? And so let's talk about networking, right? There has been lots of focus on networking lately, especially as NVIDIA and the industry transitioned to rack-scale solutions. There's a significant step-up in networking dollar content, given the scale of connectivity with your NVLink networking and switching portfolio. Networking attach to your compute revenues was around 19% in your fiscal Q3 of last year. And we define networking attach as networking revenues divided by compute revenues, right? That was about 19% in Q3, up to 21% in the July quarter.
So on average, about 20% networking attach on your rack-scale compute systems, higher than the average attach over the prior 9 quarters, which was around 7%, I think, due to the scale-up adoption, right, as we move to rack-scale. It looks like you also continue to get traction on Spectrum-X, your Ethernet product line. Is 20% the baseline on networking attach? And as you drive more Spectrum, with your recently announced Spectrum-6 platform and Spectrum-XGS for scale-across, maybe the mix trends more toward the mid-20% range in the mid to longer term, right? I'm not sure, but I wanted to get your views on that.
Yes. It's a great place to start, talking about our networking. We can definitely discuss where we've been historically and where we see it going forward. One of the ways that we have been looking at networking is: when customers are buying the full systems, which almost all of them are, how many of them are attaching networking? That's different from looking at it from a dollar perspective; it is just the attach rate. That is a very, very clean metric to understand. That number is nearing 90%. 90% are attaching some form of our networking.
Let's remind folks that our networking business is #1 in the world, having moved from a very, very small scale. But now, with the full development of all different types of switching capabilities, best of breed in terms of NVLink, nobody has even figured out how to do a lot of what we've done. That is really establishing adoption of not only our InfiniBand, which has been an important part of supercomputing for decades and decades and is world-class, but also the quickness of providing those key features in Ethernet; the adoption of our Ethernet by customers for their businesses has been a huge success as well, stepping back and looking at this in an important AI way.
It's not enough to just have a GPU chip, it's not enough to how to base. You're missing such an important part of what the networking does to capture the capabilities of scaling multiple ones together, but also dealing with the complexity of traffic and the complexity of responses that you need. At some point we needed training, and at some point we may be able to manage that all with all of our different inferencing platforms; our networking has been a huge success.
So even as we go forward and move to Vera Rubin, we are already working on some of the most important capabilities, given how important networking has been there. A part of our focus is also our work on the switch with CPO. It has been important for customers to know the amount of savings and capabilities that you can establish through a CPO environment, and we're excited to go to market with that as well.
But looking at what we see, it's very interesting. Even if they have only a part of our compute, it is very common that our networking is still being chosen for different systems. Even if they have one of their own ASICs, they will often use our switching capability as well. So we're in a full end-to-end design, and we're really excited in terms of how the networking has also been established within Vera Rubin.
Yes. And as a reflection of the traction on networking: you've always been a leader in InfiniBand switching, right? And as your customers were clearly signaling to the NVIDIA team that they were moving to more Ethernet-based switching, the team brought to market your Spectrum Ethernet switching platform. That went from like 0 to $10 billion annualized in like record time, right? And I think that when you last updated us, your annualized run rate on Spectrum-X was like $10 billion. I think that was in the July quarter.
And in the October quarter, it looks like that annualized run rate for your Spectrum platform stepped up. Jensen and you and the team announced your next-generation Spectrum-6 platform, right? This is a 120-terabit-per-second throughput switch, right? One of the fastest switches in the world. You're bringing that to market with Vera Rubin, right? So if you think about the $12 billion, $13 billion sort of annualized run rate in the October quarter, you've got a new platform coming out in Spectrum-6. You look at your order book for Vera Rubin. Like, where could this number on Spectrum be as we move through next year?
So, not giving a forecast going forward, but to understand where we already are in terms of the attach: we're going to see something similar between our growth in compute and our growth in networking over time. The only difference that you do have is just the timing of when each of those systems is put together in the full data center infrastructure that they're building.
You may have parts of that networking be the first things that are put in place in the data center, and some of the last parts of the data center are networking too; that's the only thing that really changes the growth. But so we are expecting nearly these things, not more of an attach rate in terms of what we are seeing in networking and growth moving forward.
Great. And then maybe switching over to China. I know you got some questions yesterday in the financial analyst Q&A. But following the U.S. government's approval of H200 sales into China, it appears customer interest actually looks very strong. So the question is, has the team started receiving orders from approved China entities for the H200? More importantly, how rapidly can the team start shipping H200 to these customers? And how should we frame the revenue opportunity over the next 12 to 24 months?
As I remember, Jensen last year quantified the China revenue opportunity for calendar '25 at $50 billion, growing at a 50% CAGR, right? 50% growth implies $75 billion of potential revenue demand for NVIDIA this year. Is that how we should think about the China revenue and growth profile and opportunity?
Great question. Let's first talk about the H200. We're very pleased that the U.S. government saw that this was the right opportunity for us to be able to compete fairly worldwide and provide a really good product to China. And that's what this is all about. The ability for us to ship H200 to our customers still requires a license from the U.S. government, and the U.S. government is working tediously right now on that process in order to determine the licenses for the customers. So the customers have requested the licenses, and we are now awaiting that part.
But also on the same side, we have heard from these customers from a demand perspective. That's important for us so that we can be prepared as those 2 things come together. The POs and the completion of the licenses with the U.S. government will set us on our way to begin shipping the H200 to China. We hope that that gets done soon.
But again, it's not something that we can control right now, but we are very pleased with the U.S. government's decision to do that. So we're going to wait and see what will happen. It kind of steps back though and says, what is the demand in terms of China? It's a very, very important economy and has a tremendous number of strong engineers and AI engineers, compared also to what we see here in the U.S.
So it's also a very big business, as Jensen articulated, and it's not a static business. It's going to grow very similarly to what we are seeing here in the United States, if we can continue selling going forward under the different licenses that the U.S. government has. So more to be determined on that, but let's just wait to see how we can get our H200 out.
Got it. And then on the recently announced nonexclusive licensing deal with Groq: Groq was focused on this SRAM-based, high-throughput inferencing engine. Very good for low user count and low model parameter inferencing; it seems like more of an enterprise-focused solution versus NVIDIA's inferencing solutions, which focus on very high user count and massive context input capability, right, more targeting foundational model developers. I wanted to get your views on the rationale for the Groq transaction, and how NVIDIA thinks about integrating their technology into your product road maps and target markets?
We're very pleased to have the Groq IP with us. That's what we created with an IP license from Groq. But the other most important part of it was an exceptional team that has now joined us as well. You are correct: inferencing, low-latency inferencing, has been a lot of the work that they have done. We're seeing tremendous engineering horsepower there. We found it quite exciting and very similar to our own thoughts and work going forward as well.
Bringing them on board with that IP, we're excited in terms of what the teams could work on together. We're so excited we got it done before the holiday. We have that completed, and many of them are already with us, beginning that work. So stay tuned. We don't have anything yet in terms of the exact timing of when something will come to market, but this is an important area.
Given the complexity of inferencing, the size of the inferencing market and the different needs there are going to be, and with such an exceptional team, we will be able to put something great together.
In terms of some of the market concerns that we continue to hear about, right, and one of them is the concern around the gap between a few of the foundational model builders and the current financial profiles and the data center compute capacity, right, that they've committed to over the next pages, OpenAI, Anthropic, et cetera, right? They're committing to a lot of capacity to you, competitors, some of the large hyperscale. Obviously, these AI labs will have to raise money, right? So how do you think about the risk to NVIDIA's business?
The model makers are very both foundational model makers, but also in terms of open source models as well. Most of them, if you look at them as a whole are being a very methodical piece by piece as they continue building a new training model. Okay, let's move to the inferencing and now let's get started for my next and moving in that methodical way. Many of them have had and worked in terms of how do I source to raising of cash, the raising of equity, the combination of the 2 and how do I work that carefully either with the funds or looking at it in terms of on theirselves.
I think a lot of that is very solid diligence in terms of what we'll probably see continue going forward. They are essential. These foundational models are essential from a concept's perspective in terms of what we're going forward. So working and forming and storming with how to get that completed. I think it has gone very well. Sure. They're looking in terms of long term to help us understand. This is not our ability to complete AI in the next couple of years.
This is decades. So they may talk about it in terms of gigawatts of size as we go forward. But the reality is, it's really about the year by year or quarter by quarter, how do they need to build, where do they need to build? Are they in the research side? Are they looking in the inferencing? And I think that process is fine.
Many of them are also with the CSPs. That's a very big help for them. Their -- quality of what the CSPs can provide for them so that they can concentrate on building up their models is a great combination, and we're happy to support that. And many of the work that we are doing is through the CSP and therefore, the model makers is sold, whether those CSPs be in their cloud or some of our long-standing tremendously great CSPs that we've had. It's working quite diligently in terms of all that work. So I think we're going to see more of that to come. But again, we just have to take this day by day, step by step and start to rethink about what they're planning to put together.
Colette, we're at the Consumer Electronics Show. And the one thing that we noticed was a distinct absence of new GeForce gaming platforms this year. And then I guess the question to you is, are there concerns about continuing supply of DRAM and HBM memory for gaming? How are you prioritizing and allocating these components, gaming versus data center? Do you think that there is potential for demand destruction in the seasonally stronger second half of the year, given that DRAM pricing especially looks to continue to increase through the remainder of this calendar year?
Our gaming business has been a homerun where our representation with our gamers continues to be tremendously strong and coming out with what we had with Blackwell was also hitting great strides. At the very beginning, we underestimated in terms of that growth. And that growth was so fast at the very beginning, but we have now brought that up to good level. But given our size of where we are as a percentage of our gaming markets, we're going to contain some both of prioritization, what will they need as we go forward.
But still more in terms of later on in terms of this year and next in terms of to focus. But the best part that we're pleased about is these platforms and enables creative and AR type of platforms that they can use are really an important business model. So stay tuned as we think through, demand is, again, quite strong. And we're going to try and make sure we serve as much demand as we can.
And then my last question, and I appreciate the time spent here. You've guided to mid-70s gross margins while acknowledging, right, the potential for rising input costs looking into this year. Which levers matter the most to protect the margin? Is it mix? Is it pricing, cost downs? Is it supply chain efficiencies? And where are you least willing to compromise as you think about these levers?
Yes. It's always an interesting discussion on the gross margin piece of that. It really shows our focus on not just getting the compute out, but doing it from a great position, with our manufacturers, our suppliers and our internal teams, in terms of how we can do this well.
We are sitting very close right now to that mid-70s. We don't want to look at this as, yes, we're here to grow, grow, grow that higher. We are here to keep what we said, which is mid-70s right now as we go forward. It takes a lot of different things. When you work with the complexity of the system, you are focusing on every last component. We have already done a significant amount of reordering.
We do understand what it took for the capacity of many of our suppliers and we're very supportive the many different suppliers that have pulled that together. But that now moves us working together with manufacturing. How do we improve that cycle time? How did we think about improving all of the different focus of the business as a whole? Not only can we do better and focus on that cycle time, we did also improve the cycle time of them just getting that to customers and the faster the customers.
Remember, as we move into this new year, we still have a combination of different platforms that we're building. It's not just one product. And that will both enable and also be a mix, so we have to keep in mind as we move into this new year. So right now and what you've seen all of our steps for Vera Rubin as well as what you see with GB300, very serious in terms of that process and getting that together. So we do feel that confidence that will also be something that we can work well. But let's not look at it as something easy. We will continue to work to stay about that same page.
Absolutely correct. We're just about out of time. I want to thank you as always, for your participation and your support. We look forward to strong growth ahead this year for the NVIDIA team and another solid year of execution by the team as well. So thank you very much for your participation and support.
Thank you so much. Have a great day.
Thank you.
NVIDIA — Special Call - NVIDIA Corporation
🎯 Key Message
- Core: NVIDIA is positioning itself as a platform provider for three transitions: accelerated computing, generative AI, and agentic/physical AI. Vera Rubin (a 6-chip data center system) has taped out and is slated to go into production in the second half of 2026.
- Business: Management reports very strong demand and high networking attach rates, but sees near-term uncertainty from supply chains and export licensing processes.
⚡ Strategic Highlights
- Vera Rubin: System approach with six chips (GPU, CPU, networking, SmartNICs, switches). Management cites up to 5x inference, 3x training, and up to 10x lower cost per token versus Blackwell.
- Networking: The Spectrum family (Spectrum-6 announced) drives rack and switch attach; management cites a ~90% attach rate on system sales.
- Product M&A: Non-exclusive IP license and team hire from Groq for low-latency inference; Jetson/Omniverse/Omni stacks for physical AI emphasized.
🔭 New Information
- Product status: Vera Rubin has taped out and, according to management, will go into production in H2 2026, a clear timing confirmation relative to earlier roadmap hints.
- China: Sale of the H200 is permitted in principle, but shipping still requires individual U.S. export licenses; POs are in hand, but the start of deliveries depends on license approvals.
- Networking & IP: Spectrum-6 unveiled; Groq deal closed and team already integrated; exact time to market remains open.
❓ Analyst Questions
- Physical AI: Demand from automotive/robotics (Mercedes example) exists, but management does not yet give clear revenue shares; growth is expected over multiple years.
- Supply chain: Demand is high; management says orders and capacity are planned and confirmed, but names potential bottlenecks in long-term scaling.
- China & margins: H200 shipments hinge on licenses; the margin target remains in the mid-70% range (gross margin), driven by mix, pricing, and supply efficiency.
⚡ Bottom Line
- Conclusion: The call confirms the strategic roadmap: system-oriented products (Vera Rubin + Spectrum-6) and the platform ecosystem are clear upside drivers. In the near term, timing risks (supply chain, U.S. export licenses for China) and mix effects remain relevant; investors should weigh product execution against these timing risks.
NVIDIA — Special Call - NVIDIA Corporation
1. Management Discussion
Okay. Happy New Year. It's all right. You could talk. How do we get this going? It's almost like we're doing this for the very first time.
I think we have our council...
Yes. First question over here.
All right.
2. Question Answer
Atif Malik from Citigroup. Jensen, you had a slide on the number of tokens, 10x more tokens on Rubin versus Blackwell. The question I have, historically, you have shown a slide of Blackwell performance versus TPUs on training. Any kind of simulation that can kind of put the performance of Rubin on the inference versus TPUs?
It's hard to because the only thing that's available is MLPerf. And we subject ourselves to a fair amount of -- I'm sorry. I had to spit out my candy. I'm just -- I came without a voice. I don't know where I left it. It didn't show up today.
As you know, we subject ourselves to a whole lot of benchmarking because NVIDIA is everywhere, and we're easy to benchmark, but nobody could really benchmark a TPU unless you're the TPU people. And so we don't have anything to benchmark. If you guys have anything to benchmark, we're happy to take a look at it. And I think you'll find that it compares very nicely.
MLPerf is an indication, and MLPerf is quite rigorous. MLPerf is largely governed by Google. And so -- and it's very rigorous. It's so rigorous that almost nobody finishes the test.
We're the only company that has ever finished the test every time. We finish the test every time. We subject ourselves to submission every time. So you know you're allowed to take the test. And if you don't like your own answer after you see other people's answer, you could withdraw your submission. It's the only test in the world that has this kind of stability. And we're fine with it. And so we're usually the one that submits first. And so everybody sees our answers and then they decide whether they want to submit or not. And so my sense is that a lot of people have taken the test. Not all of them can finish it, but some of them have. I just don't think that they submit it for maybe ulterior reasons. And so that tells you something.
If we're the only one that submitted the test and everybody else is empty, I think that just tells you what the answer is. It's not like nobody showed up to the Olympics. They all showed up. They just decided, we'll just let you run by yourself. But one of the best benchmarks is actually SemiAnalysis. It's a living, breathing benchmark that's there. I like it because it's living and breathing. It's continuously updating, as you guys know. And so I like that fairly well. And I hope that they submit themselves to that, because DeepSeek and Kimi, Qwen, these are state-of-the-art reasoning models. They're based on MoE. They're very hard. It's not for the faint of heart. Almost anybody can run it to completion. I mean, you'll get a token out of it. But to do it at the rates that we're talking about, it takes superhuman capabilities at that point. And this is where NVIDIA's extreme co-design capability comes in, where we're designing across the GPU, the CPU, the NICs, the NVLink switches, the CX NICs; I mean, all of that is working together.
The amount of software that has to come together for a rack that I was showing today, that pod that I was showing today, if you just think about it for a second, aside from the one that you saw there today, I don't think anybody has ever seen one really built from the ground up with that kind of capability. And so NVIDIA wrote every line of code and designed every chip, created all the systems, optimized all the algorithms.
We contributed everything back to open source, so everybody else can take advantage of it as well. And that tells you something about the level of leadership that we have. But that's -- DeepSeek is another example. And I think today, somebody just published, is it Signal65 or something like that. They just analyzed our performance and it shows about a 10:1 from Hopper to Blackwell and 10:1 reduction in cost. And yet between Hopper and Blackwell, the transistor count was only 2x, right? And so that tells you why it's so essential anymore to do co-design at the level that we're talking about. Because if you can't change everything across the whole thing, how are you going to overcome Moore's Law?
If your architecture is basically the same and you're just making a faster XPU, whatever PU you like to build, if you're just building one chip, who cares? Amdahl's law gets in the way, and you're not getting that many transistors anyhow. And so I think we've fairly clearly revealed that in this new world, if you want to keep up with the rate of model size growth, 10x growth and token growth and cost decline and new -- getting to the new frontier, and you want to go fight every single one of these battles unless you're able to keep up with the type of systems we're talking about, I think you're going to have a very hard time.
Vivek Arya from Bank of America Securities. Thanks for the informative keynote. I just wanted to clarify, Jensen, this Vera Rubin in full production, what does this mean? How is it different than what you thought before? Are you able to now ship it and recognize revenue faster? So I just wanted to clarify that.
And then my more strategic question is about your Groq licensing announcement. What does it mean in kind of the near and longer term? Are we now getting to a place where NVIDIA thinks that you will need more specialized, say ASIC-like, chips for certain kinds of inferencing? What does it mean for your road map going forward? Or should we expect to see a lot more of this kind of specialization, right? And what are the implications?
Okay. Yes. I don't mean to be pedantic, but NVIDIA's chips are ASICs. As you know, I was the youngest employee at the first ASIC company the world ever created, called LSI Logic. Wilf Corrigan, I think I was employee 150 or something like that, and I was just a kid out of school. And I was interested in ASICs because it was really about system designers using design tools to design their own chips. That's really what that meant.
NVIDIA in a large way is just a systems company. It's much more natural for us to stand in front of a rack like the Vera Rubin pod than it is for us to stand in front of a chip. And so we're very much a systems company. And we use design tools and TSMC to build our version of ASICs. So they're ASICs.
There's no question in my mind that it will be -- I think the probability of keeping up with Vera Rubin is very low for the industry if you're building one chip at a time. I would say that it would -- I don't want to say impossible, but it's, modulo, something close to impossible. It's not a one-chip-wonder thing. And you're not going to build a chip and connect them in a torus, point-to-point, because it's easier to build the interconnect if one chip connects to another chip connects to another chip. There's no switches to design. But in the case of MoE, you need all-to-all. Every single layer is an all-to-all layer. And so in our case, we literally just send information all to all, to all the GPUs. The GPUs just send information to each other through the switch. It's one hop.
In everybody else's case, you got to pass the token from one chip to another chip to another chip, depending on the size of your pod, that could be 9 hops, it could be 5 hops. And if you're doing this repeatedly as a fundamental part of the processing, it really adds up. And all to all gets in the way. And so I think the -- it's not that -- I think in the case of Vera Rubin, in the case of Grace Blackwell, I think that to build a high throughput, high-performing AI factory, I'm fairly confident with the strategy that we have.
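The hop-count argument above can be made concrete with a rough sketch. It counts worst-case and average hops from one chip to every other chip in a point-to-point torus and sets them against a switched fabric where every pair is one hop apart. The pod shapes and the names `ring_distance`, `torus_hops`, and `SWITCH_HOPS` are assumptions for illustration only; they do not describe any vendor's actual topology.

```python
from itertools import product

# Assumption: in a switched fabric every chip reaches every other chip
# through the switch in a single hop, as described in the remarks above.
SWITCH_HOPS = 1


def ring_distance(a, b, size):
    """Shortest hop count between positions a and b on a ring of `size` nodes."""
    d = abs(a - b)
    return min(d, size - d)


def torus_hops(dims):
    """Return (max_hops, mean_hops) from node 0 to every other node of a torus.

    `dims` gives the number of chips along each torus dimension; the shapes
    used below are hypothetical pod sizes.
    """
    origin = (0,) * len(dims)
    dists = [
        sum(ring_distance(c, 0, s) for c, s in zip(node, dims))
        for node in product(*(range(s) for s in dims))
        if node != origin
    ]
    return max(dists), sum(dists) / len(dists)


for shape in [(4, 4, 4), (6, 6, 6)]:
    worst, mean = torus_hops(shape)
    print(f"torus {shape}: worst {worst} hops, mean {mean:.2f} hops, "
          f"switched fabric: {SWITCH_HOPS} hop")
```

Under these assumptions a 6x6x6 torus has a worst case of 9 hops, the upper figure quoted above. Because an MoE model repeats the all-to-all exchange at every layer, that per-exchange cost is paid many times per token.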
Now the question is, in the case of Groq, I think they came to the conclusion that there's just no nook and cranny to fit into. And so they were quite interested in being part of our company. And I really like the team. And the reason for that is because even though the mainstream part of AI is likely to remain the type of -- remember, the model builders are building for an architecture that would run, right? And whose architecture is the most pervasive in the world? It's ours. So by definition, the model shape kind of wants to fit the processor. It's a bit of a chicken and egg, egg and chicken. And so it's a feedback loop. It's a positive, virtuous cycle now. And so it's very likely that the vast majority of AI is going to run that way.
I really like the work that they did with extreme low latency. Low latency and high throughput are enemies of each other, just fundamentally. And NVIDIA was built for extremely high throughput. And so the question is, is there a place in AI in the future where maybe the response time is like literally instantaneous. And you're willing to pay something for it. You're willing to pay something for it.
Let's say it's connected to my glasses. It's a use case not today, and let's say it's not a use case of normal. And literally, you're just -- it's like right there in your head, but it's in the cloud. Are you guys following me?
And so the latency has to be super low. It can't afford a few hundred milliseconds up, a few hundred milliseconds down. You say something, the AI thinks about it for a second or 2 and then responds. It's really hard to have this interactive, very comfortable sense of persistent AI. And so maybe there's a place where we might, if you will, create something unique. And so maybe a combination of something like that. But I'm just -- I'm shooting the stuff with you right now. I can't tell you what I'm going to build, but it probably is quite unique and quite cool. But it won't affect our core business. I'm hoping that it expands something new, opens something new.
So let me answer the second part of your question regarding Vera Rubin. Jensen mentioned on stage that we're in full production. If you recall, most recently, we said the chips have taped out. So we're showing you the progress that we're making, but we're still planning on the second half of this year in terms of bringing that to market.
Yes. Cycle time is 9 months plus, I would say.
You guys are on the cusp of transforming so many tremendous industries, the data center market, which you're attacking. Autonomous, you mentioned a lot about that, robotics. Could you help us understand the scope and timing for some of these key markets that you're addressing? Specifically, I would love to understand the revenue model, how you're thinking about it for Alpamayo and then also timing of robotics?
We started our first -- the first person I assigned to work with me on autonomous vehicles was 8 years ago. 3 years after that, we were able to demonstrate that we could do this. And Ola, the CEO of Mercedes, Mercedes partnered with us to build this into their fleet. It took another 5 years to architect their entire fleet to make it industrial safe to scale at the levels of a passenger vehicle that is quite sensitive about safety. And so the safety technology we implemented into the Mercedes and the driving capabilities took this long, and now we're in production.
Meanwhile, we take all of that technology, and we share with everybody. And so you might have seen on the list, BYD is a great partner of ours, Geely and Xiaomi and Stellantis and just about every robotaxi company is working with us. Nuro, I should have said it. Nuro, I think, is announcing a robotaxi at the show. And so whether it's our data center systems that are used to train the models.
So in the case of Tesla, they use our data center systems to train their models. And we've got a bunch of open source software, as I mentioned, all of our infrastructure stack, some of it's related to simulation, and some of it's related to synthetic data generation and world foundation models. And we make all of that available to the AV industry. So whether it's Tesla or BYD or Xiaomi or you name it, Li Auto and Xpeng and all of these companies are using NVIDIA. Almost everybody who has a self-driving anything has NVIDIA in the data center. And that's billions of dollars, and it's just getting started.
Some of them use us in the car. And so all car dimensioned. And then, of course, some companies in the case of Mercedes and several others are going to deploy the entire stack, okay? So we make it so that we build the whole thing, but we're delighted when the ecosystem thrives and somebody else delivers the whole stack instead of us or they actually build a chip, but we're actually in the data center, and they're actually using our software stack. I'm actually okay with all of that. It doesn't matter.
Just we want a thriving ecosystem, and I'm so confident that we're going to be at the center of it no matter what. And so I think autonomous vehicles at this point, whether it's Waabi building a self-driving truck or Aurora, Chris' company and robotaxis of the likes, okay? So it's going to be quite a large significant business now in the next 10 years. And so how long did it take us? It was 8 years getting here. But then now it's already a multibillion-dollar business. I'd venture to say somewhere between $5 billion and $10 billion. And by the end of 2030, by the end of the decade, it's going to be a very large business. It has to be at this point. And so that's how long it takes. But we're not wired that way.
The way that we're wired, we ask the question, number one, is this insanely hard to do? And if it's not insanely hard to do, then why do we do it at all? And so we ask ourselves, is it hard to do? Is it something that we are just uniquely fit at doing? And I think in this particular case, we're fairly singular in our ability to support basically the rest of the ecosystem. There are 2 very good companies, Tesla and Waymo are excellent at doing it for themselves. And we do it basically for everybody else. And I think we're quite unique in our ability to do that. And so -- and then the last part is we like things that take a long time.
I love being part of an industry where at the beginning, nobody even gives you a bit of a never mind about whether this is going to be a growth business or not. And which means if you take a look at almost everything that I do, and I literally play it out for you guys in plain sight. And I know 99% of you just go, that's 0, that's 0, that's 0, that's 0. And I love that. I love that.
I love the fact that NVIDIA is powering just about every quantum computer in the world, and everybody goes 0, 0, 0. And I love the fact that people call us -- I mean, somebody actually describes us as a gorilla in the health care industry. How could we be a gorilla in health care? But that's because we're working with just about everybody, the robotics companies, the imaging companies, the AI companies, the drug discovery companies. And so 8 years ago, that was 0. And so I'm very comfortable with that. And I like these things where it takes a long time. But when you finally get there, it's very likely you'll be quite alone.
It's Ben Reitzes, Melius Research. Great to be here. Jensen, the first part of my question is really serious. What's going on with the jackets? You had a really shiny one in the keynote, and now there's a dull one, and I was kind of going to buy the prior model. Is the quarter going really well? Are we in a multi-jacket business model now? Or what's going on?
There's no question I'm in the multi-jacket life. That's because business is going pretty well, Ben. Thanks for asking. I think that's a perfect tee-up for: demand is really strong. Demand is really strong. The reason why I wear that one is, as you know, we're in Vegas. We can't take ourselves too seriously.
However, I always feel a little insecure walking up like that because there's somebody who's a shareholder probably, and they're -- they didn't know that I'm in Vegas. And they pulled this thing out and here's the CEO of NVIDIA wearing this glitzy jacket. I always wonder about that one person. But I said Vegas. I was very clear. We're in Vegas. What happens in Vegas? And that's why I can't wear that jacket now.
All right. Well, now I have to ask a serious question. You showed that great slide. It was the second to last slide with the 1/10 and the better token economics. I wanted to ask you about Anthropic. They could have got all the Trainiums they want. They could have got all the TPUs they want, and now they're going to use a lot of your compute.
Is that slide, like, did they see Rubin coming and feel like they need to get going? And is there anybody else taking certain workloads that maybe were geared for TPUs, Trainiums or something else that are going towards Rubin because they've got to get part of that token economics?
We really would have -- if I could have rewound time, I would have made some different choices with Anthropic because in the beginning, they really wanted to work with us, but they also needed funding. And at the time, we just didn't have the resources to make funding to another start-up at the scale that they needed. But there were 2 other companies that did. They were already quite large.
Amazon and Google were already quite large. And so they were very supportive of Anthropic in the beginning, and we couldn't afford it. And so I think that, that was -- that's kind of my excuse, I guess. But the thing where Anthropic is now, nobody is generating more high-quality tokens because it's delivered for some of the most important use cases in enterprise, which is coding. And Anthropic's Claude Code is really, really good. And as you know, one of the challenges for Claude Code is that their token generation time is too long. And for software engineers, you really want to iterate. And so you need the token generation rate to be supremely fast because you're iterating. It generates an answer, you might not like it, it generates another answer. And so you're iterating with this AI assistant.
The iteration process requires fast token generation, and I think we could add a lot of value here. And so I think the -- I'm really happy that we're working together. NVIDIA is now the only platform in the world that runs every model. We run X, we run, of course, OpenAI, Gemini, we run Anthropic. And we literally run every single open source model in the world and whether it's physical AI or cognitive AI. And so the ability for us to be the every AI company, to be the every AI company, I think, is a really great positioning because when you're trying to build something, you have no idea when you're -- if you're an enterprise company or you're building your own cloud, let's say, it's a sovereign cloud or you're an AI lab.
You really have no clue where the journey is going to take you. And so you want a platform that runs everything and that the ecosystem is really rich. And so I think with the Anthropic announcement last year, we became -- that was really the last one, the only one that didn't run on NVIDIA. And so I'm very, very happy about that.
Thank you for doing this Q&A session. Aaron Rakers at Wells Fargo. I appreciate the conversation. There's a lot of dynamics going on around the supply chain, be it DRAM pricing, be it supply availability. But I'm curious if you could talk about that a little bit. And I guess the second part of that is there's been recent discussions around power and how -- I'd be curious on how you see the power envelope playing out. Is that a limiting factor to what you see as far as this build-out? Any updated thoughts around that?
Our supply chain goes upstream and downstream. And our advantage is that because our scale was already so large and because we were growing so fast already at such a large scale, we were preparing our partners for this large ramp quite some time ago.
All of you have been talking to me about supply chains for, what, 2 years now because of the scale of the supply chain that we have and the growth, not just the rate of our growth, but the scale and rate of our growth is -- every quarter, we're growing the size of an entire company, isn't that right? And that's just the delta, okay? An entire giant company, we're not talking about a start-up. We're talking about -- we're growing a publicly traded chip company every quarter. And so all of the supply chain stuff that we did with MGX, the rack level.
The reason why we're so thoughtful about how to improve all of that and standardize the components and don't waste the ecosystem, don't waste the supply chain for our partners and all the investments that we've made in them and many of them, we supported with prepay for -- so that they could build out their capacity, not tens of billions. I mean we're talking hundreds of billions of dollar stuff that we're helping the supply chain get primed up. And so I think we're in a good position because we've had such a long relationship with them. And remember, we're just about the only chip company in the world that buys DRAM.
If you take a second, you take a step back, we're the only chip company in the world that buys DRAM. People have asked us why do we buy DRAM? Because as it turns out, turning that DRAM into a CoWoS into a supercomputer is supremely hard. And getting that supply chain really plumbed up, it gave us a huge advantage.
Now that things are in a tough spot, we feel fortunate to have the skill. And then we also -- speaking of power, so look at the number of partners we have upstream. The number of system makers and we're buying memory and they're buying memory, we're buying multilayer ceramic capacitors and they're buying MLCCs and right, they're buying PCB and we're buying PCB, and we're getting everybody all around us.
Now look at us down -- so the diversity and the scale of our supply chain backwards upstream is a huge advantage. We're the only chip company that buys directly tens of billions of dollars of DRAM from all the DRAM makers. We buy from every HBM partner. Isn't that right? We certified and qualified every one of them and got them all prepped. Now go downstream.
We're the only company in the world that works with every cloud in the world. Isn't that right? The outlets of our technology are incredible, in every country, big ones and small ones and start-ups and sovereign nation ones and government-funded supercomputers and labs, and we're working with every single one of them. And so we have partners that are so diverse and broad. So our outlet downstream is really good. Not to mention, as you know, we realized the importance of knowing that downstream our supply chain is going to affect our growth, so we invested in, partnered with, supported land, power and shell companies, and people asked us why we're doing that. It was to get ready for today because we have to see our supply chain from all the way back from the equipment makers. That's why from AMAT to ASML or partners, all the way down to setting up a supercomputer and generating the first token.
If you just look at that entire path, you will see partners, customers, NVIDIA invested companies up and down that thing because we're thinking about it constantly. Getting ready -- getting the world ready to remodernize $10 trillion of the last decade's IT investment. Isn't that what we're trying to do? You just got to -- what are you trying to do? Say it out loud, what does that all mean? It's not a PowerPoint slide. This is a giant endeavor, what we're trying to do, and it happened. It's happening for first-principles reasons. I mean are we building these things and would there be consumption? We fundamentally changed the computer. You got to -- you can't use that last one. You got to use this one. And every customer knows it. Everybody in the ecosystem knows it.
Stacy Rasgon with Bernstein. I have 2 questions. Maybe one for Colette and one for you, Jensen. Colette, and maybe it builds on the supply chain points that you just made, Jensen. Around China, so clearly, now there may be an opening to sell parts there that were not sort of being contemplated before. Is the supply chain right now robust to support, I don't know what we might call a meaningful ramp without impacting the rest of the business and the supply you've already secured for the rest of the business?
And Jensen, if I could just very quickly touch on -- you made some comments on DGX Cloud early in your talk. And you talked about that was not a market you're trying to enter. But I mean, you clearly were trying to enter it at one point. So what's changed? It feels like something has changed there. Was that customer pushback? Or was it just the opportunity set or what?
Let me take your supply question regarding H200 and what we may do for China. We have plenty of supply for all of our different countries, particularly in the United States, to meet all of the different demand that we have here. So what we will do for H200 is just supply that we will have specific for China. So we're not going to take away from anything we already have with all of our different countries and their orders and demand. So we're still awaiting where we are with the H200.
First, with the government, the government has received licenses, and they are working in terms of how do they want to process those different licenses. On our side, yes, we definitely do have demand from China, and we just have to make sure we've got a togetherness across all the different governments on being able to ship that.
Yes, you should be happy to hear demand is strong.
Demand is strong. We got that.
Yes. DGX Cloud was always -- has always been located inside a CSP. So it was never intended to compete with them. It was intended to do several things. One, prepare them for the new architecture. And because of DGX Cloud, that first mission statement is preparing them, 100% of the world's CSPs were not AI cloud companies, 100%. 100% had no clue about this world. And the first time I met them, 100% rejected us. No, we don't need those kind of things. And so DGX Cloud was engineered, was created as a strategy where, okay, in that case, because I need it for my own AI models, I'll work with you, I'll partner with you to build it in your cloud. And then after we're done, you have an exemplar NVIDIA Cloud, okay? That's number one, so that we can use it as a forcing function because it's a business transaction both ways.
It could be a force because I need to rent it for my own AI models. So I have a strong need. And instead of me setting it up myself, I put it in their cloud, number one. Number two, we attract developers there. There's 20,000 AI natives. You guys, probably hasn't gone unnoticed to you guys this last year, about $150 billion of investment into AI natives because -- and these AI natives, one of the things about AI natives, it means they have cost. AI companies, enterprise companies in the past had very little cost. AI native companies have very high cost, infrastructure. And so when we work with all the AI native companies on DGX Cloud, they become a great customer for the CSP second. So we're really using it as a customer attractor for them.
We also do the third thing, which is all the models that we created, we landed in their clouds so that we can connect our friends like Siemens and Synopsys and Cadence and ServiceNow and Adobe and whenever they work with us, they're essentially going to land inside one of the CSPs. We are one of the best salespeople for the CSPs. We attract so many customers to the world's CSPs, it's incredible. And that's the reason why, on the one hand, they build TPUs, they want to compete with us. On the other hand, they're so gracious about us being in their cloud because we bring customers to them.
NVIDIA runs every AI, as you guys know. And so the fact that we can put ourselves into their clouds, do I need to do that anymore with DGX Cloud? I think increasingly, the answer is not, okay? Because I think the flywheel has started. However, I do need a whole bunch of capacity myself because, as you know, the world's #2 cloud is actually -- the world's #2 AI is actually open source AI. The #1 is OpenAI today. We can all acknowledge that. They generate more tokens than anybody. More services are connected to them than anybody. But the #2 solidly is open source. And so we have to go make sure that we continue to invest in building these open models, and we have now established that we are a pretty awesome frontier AI model builder, and we contribute tremendously to the ecosystem. And everything we contribute benefits our platform.
Everything we contribute benefits the ecosystem and also benefits the verticals that we're going into, self-driving cars, robotics, health care, so on and so forth. And so I think the flywheel, the equation, the strategic rationale is really solid in all of it. And I don't need to rent one token. Now how is it possible I can rent from them and re-rent it and make money. It doesn't make any sense anyhow. Not to mention I'm actually sitting in their cloud. There's no differentiator. And so it was never intended to be a business.
Got it. That's helpful.
It was intended to be a very clever strategy. It turned out to be quite clever.
Jim Schneider, Goldman Sachs. I was wondering if you can maybe talk a little bit about the context memory storage control you announced today. How important is that across a range of use cases? Do you see that as being a bottleneck to performance of a certain subsegment of customer problems? And should we expect you to sort of continue innovating on that vector similar to what you did in networking in the past?
We're the largest networking company in the world today. I'm expecting us to be the largest storage processor company in the world. That I think is likely, very likely to happen. And it's very likely that we will ship more high-end CPUs than just about anybody else either. And the reason for that is because if you look at Vera and Grace and Vera, they go into the SmartNIC of every single node, okay?
We are now the SmartNIC of AI factories. There's other -- a lot of CSPs have their own SmartNICs like Nitro, and they'll continue to have theirs. But outside, BlueField is incredibly successful. And BlueField-4 is going to knock it out of the park. And so the early -- not the early adoption, but the adoption of BlueField-4. And the software layer on top is called DOCA, rhyming with CUDA. And so DOCA is adopted all over the place now. And so for networking, east-west traffic, high-performance networking, we're the largest.
For network isolation, north-south networking, I'm fairly certain we're going to be one of the largest. For storage, that is a completely unserved market today. The way that storage works is SQL. SQL is structured data. A structured database is lightweight. The AI database, the KV cache, is insanely heavyweight. You're not going to hang that off of your north-south network. I mean that's just a horrible waste of network traffic. You want to put it right into the computing fabric, which is the reason why we introduced this new tier.
This is a market that never existed. And this market will likely be the largest storage market in the world, basically holding the working memory of the world's AIs. And that storage is going to be gigantic, and it needs to be super high performance. And so I'm so happy that the amount of inference that people do has now eclipsed the computing capability that the world's infrastructure has. And so we now -- the amount of context memory, the amount of token memory that we process, KV cache we process is now just way too high. You're not going to keep it -- keep up with the old storage system. And so when an inflection point in the market happens and you had the vision to see it happen, and then this is a brand-new market because of this inflection. This is the best way to go into a market. And BlueField-4, there's nothing like it, absolutely nothing like it.
Will Stein from Truist Securities.
Someday, nobody is even going to ask me a question about GPUs. I'm going to -- but don't worry, I'm going to have to throw it out there.
My question is about the ramping velocity of Vera Rubin relative to the 2 prior generations. And in particular, when you -- as you said, you're in full production today with Vera Rubin. As that starts to get rev rec in the second half, we certainly expect Blackwell will still be going as well. Will Hopper also be still present? Will you have 3 architectures at once?
And regardless of that question, maybe you can talk about the margin impact because it seems like your ramp for Vera Rubin should be much faster given the discussion of the time to manufacture you addressed today in the Q&A or in the presentation.
Yes, I appreciate that. Vera Rubin's ramp should be fast. The challenge for Vera Rubin, it is the only computer in history where literally every single chip is new. I don't think you could buy a phone where every single chip was new. Even the high-temperature capacitors were new, never existed. Even the HBM4 never existed. LPDDR5 SOCAM never existed. Are you guys following me? I'm talking about the stuff. I'm not even talking about my chips yet.
Literally, every single chip in that computer was brand new. The fact that we made it work at all is a miracle. The fact that we made it work perfectly is just incredible. And so that there were -- that was where the risk of Vera Rubin was because there were so many new technologies coming together. You got co-packaged optics coming in on a switch, the largest switch ever made. You've got all these different things, all these technology. And so we had the wisdom of breaking down the problem, and we were working on Vera Rubin for several years, piecing together the technology, derisking the important parts of the technology and derisking important parts of the supply chain for several years now.
Vera Rubin is not a 1-year project. Vera Rubin is probably something close to 5 years. And so we take the most difficult parts, and we derisk every part of it. And sometimes we'll even mix it in with some other technology, and we've actually already shown it. It just wasn't a differentiating part. It wasn't the part that mattered. And we built these pieces, notice, we've been building BlueField-4 for some time. But I derisk the CPU, we need it to be -- because you have so many -- so much storage, the memory -- the performance per watt, the energy efficiency of the CPU has to be insanely great. You're not going to get -- you're not going to put up that rack of BlueFields and put your favorite x86 in there. That's just not going to happen. It's either not fast enough or it's going to draw too much power. And so we had to build a CPU that was -- that had the data rate, the perf per watt was as low as Grace. And then we built -- because the SerDes, we couldn't build it that fast several years ago.
We had to wait until the SerDes came along. And when everything came together, that's when your BlueField-4 was realized, okay? So now we're on this rhythm. BlueField-5 will be easy to do. BlueField-6 will be easy to do. But that's derisking all of the technology components of Vera Rubin was just a massive undertaking. And so now we've derisked -- now everything has proven to work. Everything is in volume production. I'm just so incredibly proud of the team. I mean when you just look at all the chips, not just our chips, but other people's chips that have to work with our chips. It's a miracle. But the problem is if you don't do that, then we literally had 1.5x more transistors and call it 1.7x more transistors. Who cares?
Are you guys following me? You're not going to rack up a whole new AI factory for $50 billion for 1.5x. You're just not going to do it. The bar is too high now. Just you got to imagine that person. It's not about, oh, 50% better camera, sure, I'll pay for that. It's 50% better AI factory. You're not going to pay for that. Does it make sense? It's $50 billion. You're not going to do it for 50%. You do it for 10x. You'll write a check for 10x, but you're not going to write a check for 50%, which is a challenge because as all of you have heard me say, Moore's Law is over. Moore's Law is completely over. If every single year, we're getting 50%, that -- and AI is going -- we're not getting 50%. But every single year, we're getting probably something like 20 -- 15% to 25% tops out of transistors these days. But AI is going 10x per year and token rates going 5x per year, there's no way to keep up. And so we have to do something like extreme codesign and really basically revolutionize everything.
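The gap described above can be put into rough numbers. The following is a minimal sketch that takes the spoken figures at face value (15% to 25% per year from transistors, 5x per year token rates, 10x per year for AI) and asks how much of the yearly gain has to come from everything other than transistors. The function names are illustrative and the figures are estimates from the remarks, not measured data.

```python
def compound(yearly_factor: float, years: int) -> float:
    """Total multiplier after compounding a yearly growth factor."""
    return yearly_factor ** years


def required_non_transistor_gain(demand_factor: float, transistor_factor: float) -> float:
    """Yearly gain that must come from sources other than transistors
    (architecture, networking, software, numerics) to keep up with demand."""
    return demand_factor / transistor_factor


# Spoken estimates from the remarks, treated here as assumptions.
TRANSISTOR_GAIN_LOW = 1.15   # "15% ... tops out of transistors"
TRANSISTOR_GAIN_HIGH = 1.25  # "... to 25%"
TOKEN_RATE_GROWTH = 5.0      # "token rates going 5x per year"
AI_GROWTH = 10.0             # "AI is going 10x per year"

if __name__ == "__main__":
    for years in (1, 2, 3):
        best_case = compound(TRANSISTOR_GAIN_HIGH, years)
        tokens = compound(TOKEN_RATE_GROWTH, years)
        print(f"after {years} year(s): transistors {best_case:.2f}x, token rate {tokens:.0f}x")
    gap = required_non_transistor_gain(TOKEN_RATE_GROWTH, TRANSISTOR_GAIN_HIGH)
    print(f"yearly gain needed beyond transistors (best case): {gap:.1f}x")
```

Even with the optimistic 25% figure, transistors compound to less than 2x over three years, while a 5x yearly token rate compounds to 125x over the same period.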
Srini Pajjuri from RBC Capital Markets.
We're going to ramp Vera Rubin pretty fast. This is easier to ramp than this. This is literally what all of the world's computer companies are building right now. Just enormous workforces, enormous factory floors, building this tray. This compute tray is the limiter of the world's AI factories today. This is what everybody is building.
Look at the number of connectors, and so we realized the labor content that went into this. And then once you get here, literally, this was 2 hours. You could just watch them. It takes 2 hours. An assembly of people, like an operating room, standing around this thing, building it like a car. And now you're happy to build this like a car because, of course, the economics is better than a car, right? This is a full car, okay? And so you're happy to build it like a car, but you'd rather build this, and the reliability is going to be better -- it's called RAS, reliability, availability and serviceability.
The one thing I didn't say today, I should have remembered, so NVLink, this generation, Vera Rubin, you could -- the world's first networking system, it's hot swappable. You take part of the network, you pull it out and you could service the rest of it while it's still running, update the software while it's running. It's craziness. So the goal is just to keep that entire AI factory running all the time. Does that make sense? You just paid $50 billion for it. You don't want to -- the concept of downtime, you would go insane.
And so just the amount of technology and innovation we -- from all the learnings of working with everybody that went into Vera Rubin, all the things that I didn't say, it's just -- it's a miracle. And so we're going to try to ramp it as hard as we can. Second half, we should sell lots, ship lots.
Jensen, up here. I guess my question is more about the longer term. How do you see the frontier model market shaking out? If you look at the history of tech over the last 20, 25 years, there's always been 1 or 2 winners. We still have a lot of frontier models and a fairly fragmented market. At least from my usage point, they all seem pretty similar to me. So I'm just curious as to hear your thoughts about how do you see the market shaking out?
Do you think AI is so different that we need so many frontier models going forward? Or if there's going to be a shakeout, what do you think will cause that?
Yes. Good question. As soon as -- like, for example, if everybody I work with is smarter than I am -- they're all smart enough. Okay. Are you guys following me? If all the AIs are smarter than we are, then they're good enough. And so that's what you're feeling. You're feeling the good enough for general use. But for domain-specific use, they're hardly good enough. And that's where I think you're going to see a fair amount of breakthroughs this year in vertical agentic systems because it's -- I think it's not easy to just boil the ocean and make every AI be great at, on the one hand, chemistry and the other hand, biology and also drive a car and do it effectively and efficiently. I don't think you could reasonably do that.
I have a fairly good understanding of the technology. I don't think you're going to reasonably do that. The technology is related, but optimizing for all these different domains is quite different. And the specialization of the workflow, the flywheel, the data, the training, even the people to evaluate it, it's very vertical. And so I think long before I'm worried about the foundation model industry, I think verticalization is clearly going to happen.
Now, is the verticalization going to happen outside the foundation model companies or inside the foundation models? That's a different question. However, in many of the companies that you're thinking about, remember, they already have verticals. Meta already has very deep verticals in digital marketing and right, in AI ad serving. And so they have a lot of verticals that they're very, very good at. And it's not likely that they're going to -- they need to or it's not likely they have -- they will give up that model creation to somebody else because that vertical is too important to them and they own the channel, they own all the expertise.
You could say the same with Google and you could say the same with X, you could say, does that make sense? I mean there are several different companies where their go-to-market is so good because they're already domain experts in those verticals. I think that, that flywheel is going to continue to sustain.
In the case of OpenAI, they are the Google of our time. And I use all of the AI models, but I kind of, for some reason, always go back to ChatGPT. And I kind of always go back to Google. And so it's -- I think they've become that. And so that's a great outlet for them, the consumer outlet.
In the case of Anthropic, their enterprise capability is really, really good. And so maybe that's their angle. I think that ultimately, people -- each one of these model makers have to find an outlet, a channel that they very, very significantly secure. And otherwise, for everybody else, it's going to be open models.
I just mentioned 5 of them that actually are fairly secure. But for everybody else, I think open models is likely to be the answer. And then they're at the frontier. They're not the frontier, but they're at the frontier. Maybe they're months behind. And we'll keep everybody there. And companies like Lilly and companies like ServiceNow, and -- they could take it and create their own version of AI, the ServiceNow AI, the Snowflake AI and then just -- but they build it off of open models. My sense is that, that's likely the outcome for the future.
Jensen, over here, Louis Miscioscia, Daiwa Capital Markets. So congratulations on the Mercedes launch after 8 years and pointing to autonomous vehicles driving finally hitting now. Can you give us some thoughts about where the industry is for agentic and physical? You talked about it last year, you talked about it this year. And there is a lot of examples, but do you think we're going to see critical mass volume in 2026 from these areas, mainly from an inference standpoint and deployments?
If not for Agentic AI, there would be no Cursor. Cursor is agentic. If not for Agentic AI, there's no OpenEvidence. It's an agentic system. Almost all of the best AI native companies that you know are Agentic AI companies. But we call them Agentic AI today because we're trying to say that it's different than one shot or models that don't use tools.
In the future, that's just called an AI application. There's no question in my mind now, this framework of AI systems is likely going to be the basic framework for building applications in the future. And so instead of off-the-shelf libraries, you're going to have off-the-shelf models and off-the-shelf agentic systems. And they're just going to -- you're going to plug them together, you're going to tell them what your goals are and they're going to try to work together. And so getting applications deployed and working together is going to be just easier and easier and easier. And so the technology is hard, but using it should be easier.
Today, the technology is hard and using it is hard. And that's the reason for that is because the technology is not good enough. But as you see that in the last 2 years, the AI technologies have gotten so good that using ChatGPT for research and solving a lot of questions has become so much easier. And this is going to be the same with AI applications, and they're all going to become agentic.
Alpamayo is an agentic AV. It's agentic in the sense that it looks at the world and it goes, I've never seen the circumstance before. What's going on? It breaks the problem down into things that are fairly routine. And it goes, I know that. I know that. I can -- based on that, this is what I would recommend not do. And so it's a reasoning system. And it's using -- it's using common sense, physical common sense that we taught it. And if you ask it to take you something -- it's a full -- it's called a VLA, vision language action model. It's a full robotic system. So you could just tell the car to take you somewhere. If the car is about to do something and you're not sure -- why are you doing that? The car will just talk back to you. And so it's a fully agentic system.
Joe Moore from Morgan Stanley. I wonder if you could talk about how to think about sizing the physical AI market. It seems intuitive that this is a very difficult problem to solve because of simulation, there's a lot of dollars that needs to be spent. But when you did this with large language models, you had cloud CapEx budgets that just sort of shifted towards you in the early stages.
For Physical AI, do we need to see companies raising money? Is it going to be automotive and industrial companies? Just how do we think about what's going to fund what seems like a pretty big project ahead of us?
Physical AI has the benefit of riding on the shoulders of the large language model. And it's called multimodality, number one. It's just -- it's able to understand vision and language at the same time. It's multimodality, and it's aligned meaning that -- let me see if I can give you an example. So in your brain, and I'm pretty certain of it, this is true. C-A-T, those 3 words, Meow and the picture of a cat. To you, it's exactly aligned in the same place in your brain. It's aligned.
Over the years, we've aligned -- it's multimodality alignment. And so you have to use this concept called cross-attention and you're learning 2 things at one time. And then your vector space, it becomes same, okay, or in the same geography. And so we have the benefit of large language models that were already trained. And so we take essentially Nemotron and then we take something that is trained very specifically on vision and world models, which uses a lot less data.
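As a rough illustration of the cross-attention idea mentioned here, the sketch below lets tokens from one modality (text) attend over embeddings from another (image patches), so each text token's output becomes a weighted mix of image features. It is a bare scaled dot-product version in plain Python. Real models add learned query, key and value projections and train them, which is omitted here, and the toy embeddings are invented for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one modality
    and keys/values come from another. No learned projections (assumption)."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    # Toy 2-d embeddings, purely hypothetical.
    text_tokens = [[1.0, 0.0], [0.0, 1.0]]    # e.g. "cat", "dog"
    image_patches = [[1.0, 0.0], [0.0, 1.0]]  # e.g. cat patch, dog patch
    for row in cross_attention(text_tokens, image_patches, image_patches):
        print([round(x, 3) for x in row])
```

Each text token ends up weighted toward the image patch it is most similar to, which is the sense in which the two modalities come to share one vector space.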
In combination, we create Cosmos. So Cosmos still took tens of thousands of GPUs several years, but it's less. But now that we have Cosmos, and we give Cosmos away to everybody. That's the point. We're trying to lower the bar. You still have to do your own fine-tuning. You still have to do your own domain adaptation. And so we created Cosmos so that we can lower the bar for everybody to have Physical AI. If the bar for physical AI is such that we have to do what we're doing here, replicate it 3 other times, one for digital biology, one for physics, obviously, it's unnecessary and it will take longer. And so we -- in a lot of ways, we burdened not all of it, but probably 1/3 of it for the world. And now they can take this and run with it.
Jensen, thanks a lot for doing this. Colette, thanks a lot for doing this. Ananda Baruah with Loop Capital. A neocloud question for you. I would love to get your view on how big picture, the neocloud to the GPU clouds fit into the space structurally. NVIDIA has continued to deepen its partnerships with the neoclouds.
At this point, the core customer bases from the AI labs to the hyperscalers, sovereign and now even enterprise are moving closer to embracing the neoclouds. So I would love to get your take on the role you see them playing big picture. And then just as a quick part B, to the extent that enterprise and sovereign adopt neoclouds, does that help NVIDIA sell their Enterprise OS?
We believed in the existence of this category because the technology is still changing fairly fast. When we started cultivating what people call neoclouds or at the time, they were just called GPU clouds because that's all they had in their clouds. We started cultivating them because -- and partnering with them because we realized that the AI technology was moving fairly quickly, and we knew that our technology was going to move fairly quickly. So the market was moving quickly.
The technology is moving quickly. And the access to land, power and shell was also not something you could take for granted. And so in this combination of circumstances, we felt that a community of fast-moving, agile regional players would likely be quite successful. We were right. And so Nscale is a European region. You have Yotta in India. They're a regional player. And of course, there's CoreWeave that we know about, right? And Lambda that people know about. And so we have all these different regional players, Humain in the Middle East. And so there's a lot of different regions.
My sense is that you're going to find more. And each one of these geographies are going to build up their AI infrastructure. And that AI infrastructure and community is going to have to include everything from researchers and start-ups and isn't that right, and large companies. And oftentimes, being where your customer is makes a difference. And so that's the reason why we did it. And now it's given us a network of outlets to the marketplace and partners who are constantly seeking out land, power and shell opportunities, speaking back to the power challenges, but customers and partners who are trying to build this ecosystem with us. And so we're quite informed as a result of all that.
It's Chris Caso from Wolfe Research. The question is about margins and really sustainability of margins at these high levels. It's one of the most frequent questions we get from investors. I know you've answered the question for this year that they stay in the mid-70s. But 2 questions on that. One is, what's different with the Rubin ramp that allows you to maintain those margins in the early stages of ramp because typically, your margins will come down in the early stages. And then longer term, how does NVIDIA continue to price to value in the face of competition? How do you maintain these high margins?
I think that -- so for the ramp questions, it's just whether it's in the plus or minus 0.5%, that's kind of obviously hard to predict, nor I think is that your question. In the long term, our margins are exactly directly related to the value that we deliver. And I simplified the world of value creation down to basically 3 charts. And those 3 charts are insanely hard to get done right.
The first one is how many GPUs does it take for you to train a reasonable sized model in the time that you need to train it. And so you've got to get to market every year, and you need to iterate on that model several times. And you like to win the front -- you like not to lose the frontier. And so your training capability matters a lot. And that's not just a flops thing, because the shape of that curve, the shape of that curve that I showed you is directly related to co-design. It's got networking problems. It's got memory bandwidth problems. It's got NVLink problems, it's got software problems. It's got every problem.
The shape of that curve, otherwise, it would just be, have you noticed, I'm the only CEO that shows you guys shape of curves instead of just a bar. Do you guys know the bar? You just pick a point. It's like that's not life. Life is not like a bar. And so life is -- and I can't show you the multidimensional curves that we have, and we call them frontier Pareto, Pareto frontiers. And we sweep across a multidimensional space. And that's the simplest version that we -- and it's the one that makes sense most to deliver value. And so there's the training side of it, there's the inference side of it, and then there's throughput side of it.
The inference side of it has to do with your -- the cost it takes to generate the tokens, given the latency, the quality of service requirements of that time. People are expecting more and more tokens generated, which means that the token rate has to be much shorter and yet you need to deliver it much more cost effectively. That multidimensional problem was described in that chart. And then the next one is in the final analysis, if you're going to build your infrastructure this year, you better be darn sure that your revenues will go up next year. And that has -- and you've got only 3 gigawatts or 2 gigawatts or 1 gigawatt. That's it. So within your 1 gigawatt, your perf per dollar, your perf per watt, that last chart that I showed, it was the second or third. That chart is insanely hard. That's not a bar because the workload differs across all these different scenarios. And so your job is to, one, support your research team to get to the frontier.
Two, once you have a model, you better make money at it because otherwise, it has to generate tokens fast enough for a good quality of service. And then three, the entire data center has to generate enough revenues that the CapEx you put down next year is going to cause your company to grow, not shrink. And so these are such complicated issues, and I distilled it into one chart, but there's mounts of simulators behind it. And to the extent that we can continue to do that, at the levels that we're doing that, it's easy to go 1.5x. Actually, I take that back. That's not even that easy, okay?
If you build a chip that's 1.5x more transistors, you put it into that supercomputer, you don't get 1.5x, you'll get 1.2x because of Amdahl's law. And so you have to do extreme engineering, extreme codesign across all of this and invent things like NVFP4 to bust out of Moore's Law. Are you guys following me? Otherwise, you're like everybody else. And so that's the challenge.
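The 1.5x-to-1.2x remark is Amdahl's law at work. As a minimal sketch: if a fraction p of the total runtime benefits from a speedup s, the overall speedup is 1 / ((1 - p) + p / s). The value p = 0.5 used below is not from the remarks. It is simply the fraction that makes a 1.5x component gain come out as 1.2x overall.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the runtime is accelerated by s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def accelerated_fraction(overall: float, s: float) -> float:
    """Solve Amdahl's law for p, given the overall speedup and the component speedup s.

    1 / overall = 1 - p * (1 - 1 / s)  =>  p = (1 - 1 / overall) / (1 - 1 / s)
    """
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / s)


if __name__ == "__main__":
    # Which accelerated fraction reproduces the quoted 1.5x -> 1.2x?
    p = accelerated_fraction(overall=1.2, s=1.5)
    print(f"implied accelerated fraction: {p:.2f}")
    print(f"check: {amdahl_speedup(p, 1.5):.2f}x overall")
    # Even an unbounded chip-level speedup is capped by the rest of the system.
    print(f"limit as s grows: {amdahl_speedup(p, 1e12):.2f}x")
```

Under that assumed fraction, no amount of extra transistor performance gets past 2x overall, which is why the other half of the system (networking, memory, software) has to improve at the same time.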
I think our ability to sustain value, which is deliver the value across the dimensions that I've said, is fairly -- I wouldn't say it's singular, but I would say it's fairly close to singular. And we're doing it at the scale, and it's getting harder every time. As you know, it's getting -- obviously, it gets harder every time. And so I think -- so that's -- those are the dimensions. You take the performance. In the end, it's about that 1 gigawatt data center. That's the simple math. It's $50 billion.
Does your $50 billion deliver more revenues than somebody else? Do you deliver more tokens than somebody else? Okay. It's $50 billion. If somebody else is willing to do it for $40 billion because $20 billion of it is just land, power and shell, isn't that right? Chips are free, still $20 billion. If everything was free, it was still $20 billion. Are you guys following me? Okay? And so you better be sure that whatever you put on top of that $20 billion, you're going to be happy with. And so that's why I think the difference between good margins and poor margins. It's such a small difference across that $50 billion. So long as we continue to deliver the throughput, I hope that customers will continue to reward us for that. But that's our basic strategy.
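The arithmetic behind the $50 billion versus $40 billion comparison can be sketched as follows. It assumes, as stated above, that $20 billion of a 1 gigawatt site is land, power and shell regardless of whose chips go in. The function names and the idea of expressing the result as a break-even throughput ratio are illustrative additions, not figures from the remarks.

```python
LAND_POWER_SHELL_B = 20.0  # "$20 billion of it is just land, power and shell"


def total_cost(compute_cost_b: float, fixed_cost_b: float = LAND_POWER_SHELL_B) -> float:
    """Total site cost in billions: fixed infrastructure plus compute."""
    return fixed_cost_b + compute_cost_b


def breakeven_throughput_ratio(cost_a_b: float, cost_b_b: float) -> float:
    """How many times more tokens site A must deliver than site B
    for both to have the same cost per token."""
    return cost_a_b / cost_b_b


def compute_discount(cost_a_b: float, cost_b_b: float,
                     fixed_cost_b: float = LAND_POWER_SHELL_B) -> float:
    """Discount on the compute portion implied by B's lower total cost."""
    return 1.0 - (cost_b_b - fixed_cost_b) / (cost_a_b - fixed_cost_b)


if __name__ == "__main__":
    site_a, site_b = 50.0, 40.0
    print(f"floor with free chips: ${total_cost(0.0):.0f}B")
    print(f"compute discount implied by B: {compute_discount(site_a, site_b):.0%}")
    print(f"total discount: {1.0 - site_b / site_a:.0%}")
    print(f"throughput A needs to match B per dollar: {breakeven_throughput_ratio(site_a, site_b):.2f}x")
```

Under these numbers, a rival that cuts the compute bill by a third lowers the total by only a fifth, so the more expensive system breaks even per token once it delivers 1.25x the throughput from the same gigawatt.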
And then today, I spoke about the new dimensions of that data center, which is we really want to invest in making sure that the resilience -- the reliability, availability and serviceability of the data center is world-class. And that's very, very hard as well.
So Jensen, following up on the previous question on the AI models. What do you think might be the game changer for the current competitive landscape? Because since now, it seems we are in an oligopoly where 3 to 5 big players dominate the market in different verticals. But any game changer in the next several years? And what would that be?
And my second question is, seeing that AI tokens are growing 5x year-over-year, when would that reaccelerate or slow down? And what would be the catalyst for the change? And my final question: what's the next milestone for both AI and NVIDIA that you are looking for, that you are super excited about?
If I told you the last one, you're not going to come to the next one. Some of the big breakthroughs that are still right around the corner, how to deal with memory is a big deal. And today, context lengths could be 100,000, it could be 1 million. But clearly, the right context length is infinite. And so how do you deal with infinite context length where you have to do attention across all of that. So there's some good research that needs to be done there.
We're playing with a lot of research in this area ourselves. That's why our model is hybrid. State space models, the SSMs are highly efficient at compressing context and -- versus transformers. And so we've created a hybrid version. And so Nemotron 3 was very efficient at context processing and token generation. I think there's a lot of research in this direction but still -- I think the -- some of it is related to the system architecture, as I mentioned today, BlueField-4 to bring memory closer. It's kind of like your long-term memory is not -- doesn't sit in your head. It sits on the network somewhere. We just took your long-term memory and we put it in your brain, okay? We put BlueField-4 in the same rack. And so we're going to put your long-term memory in your brain. I think it makes a lot of sense.
There's a lot of continuous learning type of things, which is you come up to a circumstance you haven't seen before, you reason about it and you still can't solve it. Maybe it's a domain. You just don't even have first-principles knowledge of it to even break it down. And so you might know that you don't know it, to know that you don't know something and to go study up, go do research so that you have some first-principles knowledge of something and then you come back and you reason about it. So it's kind of almost like the AI first, you prompted it, and it doesn't know the answer at all. But it's okay, behind your back, it just went and did some research and learned it. Now it can reason about your question, does that make sense? And so continuous learning doesn't have to -- I think that there's a fair amount of research opportunity there.
And your second question, what's going to be the inflection -- the next inflection of token rate generation? The 5x came about because of reasoning. I have a feeling that we have a 50x in front of us. And the 50x in front of us is because of agentic systems. The reasoning is going to come with it, tool use and planning and simulation. It's almost like AlphaGo in real time. And so I think that's probably around the corner. These agentic systems is likely to cause token rates to go way up.
The demand for computing is really, really high right now, as you guys know. Because of the 3 scaling laws, reasoning and agentic systems is putting enormous amounts of pressure, multimodality, larger models is putting an enormous amount of pressure on the training systems. On the inference systems because of long thinking, putting enormous amount of pressure. And then all of a sudden, agentic systems comes along, putting enormous amount of pressure on the -- and then because these models are now so useful, the number of AI start-ups is going through the roof.
If you look at the number of AI start-ups from 2025 versus the year before, it almost doubled. And the amount of money they raised, most of it is going to go towards compute, right, so that you can be an AI company. And so the amount of burden on -- the pressure on the computing companies is really high. That's why our customer demand is so high.
Jensen, Ruben Roy from Stifel. We've talked about Extreme co-design quite a bit. And the stats are staggering, right? The Spectrum-X run rate, getting to 400 gig SerDes. What does that say about R&D intensity going forward? And are you seeing tangible benefits today from some of the things you're doing with either the EDA companies or Cursor? Is that impacting your R&D?
And then I guess the last part of that question is just your M&A philosophy and just thinking through some of these recent deals, Groq and Enfabrica and how you're thinking about M&A relative to getting and accelerating the design process?
Our -- if you look at all of our investments, you could see us investing at almost every layer of the stack. From land, power, shell, chips, infrastructure, models and applications. We either invest in it directly, meaning we're building it like, for example, NVIDIA's open models. We're one of the largest model builders in the world. We just don't ever talk about it. And now we're getting so much attention in this area, positive attention in this area. You'll hear us talking about it more. And it also explains why NVIDIA's cloud spend is so high because we're building the world's leading models for open science and all these open markets. And so -- but that creates such an incredible flywheel for our architecture. It's one of the best investments we have. And so you see us invest across that entire stack that way.
You then see us invest across the industry. Like, for example, we invest from digital biology to agentic systems, all the way to robotics and autonomous vehicles. And so we'll invest across the industries. And then we'll invest across the dimension up and down the supply chain. So when you see us invest, and I conflate R&D budget with invest because to me, it's the same. It's improper that most people split the 2. It actually -- investment is investment. What difference does it make? You're investing internally or externally to enhance your company's market position. Isn't that right? That's our fundamental goal. That's what all of you would like me to do. And so I see this large investment portfolio in this larger universe in that simple way.
Where we could do something very uniquely, we'll likely do it internally and bulk it up. Like, for example, NVLink is clearly revolutionary, and we're singularly the only company in the world with it today. The first version of it is likely going to be scale out Ethernet called scale up Ethernet, and it will work out the same way. And -- but here, it's such an incredible capability and the rhythm of our company is so fast that it made sense for us to bring like-minded people to our company, and this is where Enfabrica came along. Rochan and the team were always thinking about scale up, and they just described it in different ways. And their market traction was difficult because it's hard to build a company that does scale up.
The compute and the fabric are highly integrated. The software stack is completely integrated. And so it's hard to disentangle them into different companies. Some of the things that I'm thinking about with Groq, it's hard to be disentangled. There's a lot of innovation and invention that has to happen at the 2 teams level. And so there, again, I think the teams out has just joined us in that particular case. But to the extent that we can invest outside, I prefer that. I prefer to keep NVIDIA as small as possible, as large as necessary. And if you look at our company today, we're, what, 40,000 people. We're probably one of the leanest fighting machines on the planet. And yet our ecosystem is really sprawling. And that's the size -- that's the mental model of a company of the future. And so I think the level of R&D spend of our company is quite sustainable.
Natalia Winkler, UBS. So one question I want to follow up on Groq. I was wondering if there's any software technology from that deal that you could leverage across the NVIDIA portfolio broadly?
Yes, no doubt because the programming model of the Groq chips is very different than the programming models of our chips, which is the reason why they're extremely low latency, and we're extremely high throughput. Some of the things that I'm thinking about doing would rely on the reshaping of some of that stuff. But we're still -- we're just really at the beginning part of it. We have plenty of time to go do this. And if we succeed in doing it, it'd just be yet another dimension of capability we can bring to the world's AI factories.
I think the vast majority would still just be things like Grace Blackwell and Vera Rubin just be, call it, 90%. But maybe in the future, 10% of it is an extreme version of something. It's kind of like it's okay to build a mid-engine and an SUV. It's okay. It's still, both Ferraris, right? And so I think what NVIDIA does is not build GPUs. What NVIDIA does is build AI infrastructure. Are you guys following me? I barely showed you a GPU today. I stood in front of a pod. And so our goal is to build AI infrastructure that is insanely amazing, not just for 80% of the world, but -- or 87%, but hopefully for 100% of the world. And even use cases you didn't even know about yet.
Okay. I believe you have time for one last question.
Ken from Robocap. So it's a question about both margin and technology. You currently already have CPX technology, and through the acquisition of Groq, you also have access to SRAM that can be used in inferencing. Your team also published a paper a month or so ago about how to use CPX with the GPU so as to reduce the usage of HBM, because you can use GDDR7 instead of HBM. And we all know that HBM is very expensive. So going forward, because of the combination of Groq and your in-house CPX technology, how would you see your usage of HBM? Can it be more under control, so that it could be more positive for your margin going forward?
Sure. So I can describe -- I'll describe the benefits of each one of these things, and then I'll describe the challenge, okay, why it's not so obvious. For example, CPX does prefill per dollar better than like, for example, Rubin CPX versus just Rubin, it does prefill per dollar better. So its prefill -- its perf per dollar is higher than Vera Rubin normally. And if I keep everything on SRAM, then, of course, I don't need HBM memory. And -- but the problem is then the size of my model that I can keep inside these SRAMs is like 100x smaller, okay?
However, for some workloads, it could be insanely fast because SRAM is a lot faster than going off to even HBM memories, okay? And so I think you can kind of see the benefits in prefill and decode. The problem is workloads are changing shape all the time. And sometimes you have MoE, sometimes you have multimodality stuff, sometimes you got diffusion models, right? And so sometimes you have autoregressive models, sometimes you have SSMs. These models are all slightly different in shape and size. And sometimes they move the pressure on the NVLink, sometimes they move the pressure on HBM memory. Sometimes they move the pressure on all 3.
And so my point is because the workloads are changing so fast and because the world is innovating so fast, that's one of the reasons why NVIDIA is just universally the right answer because we're flexible. Does that make sense? If your workload is changing from morning to night and depending on what customer you have, we're equally -- we're versatile. We're good at almost everything. And you might be able to take one particular workload and push it to the extreme. But that 10% of the workload or even 5% of the workload, 12% of the workload, if it's not being used, then all of a sudden, that part of the data center could have been used for something else for 90% of the workload and you deprived it because you only have 1 gigawatt.
The trick is to think through that one data center, not as infinite money and infinite space, but you have finite power. And so you have to utilize that finite power for the overall consumption of the data center. And the more flexible it is, the better it is. The more unified the architecture is, the better. Like, for example, if we update for a new DeepSeek model, then every single GPU in the data center, all of a sudden, every one of their performance goes up. I update the library for the Qwen model and the whole data center goes up. I do it for -- does that make sense? But if you have 17 different architectures, one is good for this thing, one is good for that thing, then as it turns out, the overall TCO is not as good. So that's the challenge. And even when I'm building these things, I know what the challenge is. It's very hard.
I'm exploring ways to beat myself all the time, and it's hard. We're exploring all kinds of different chips that try to build a better solution than even the one we have. It is extremely hard. But the exercise is worth it, isn't that right? If I'm constantly trying to come up with a new way to even do better than what I currently have, if I'm doing that myself, then I'm exploring all of the nooks and crannies. I'm trying to disrupt myself all the time, okay? So I think that at the moment, CPX is exactly as you say, but it also reduces the flexibility of the data center. And I think with that, let's see, what should I tell you guys?
One, demand is really high. If I didn't say that earlier, demand is really high. Two, Grace Blackwell GB300, the transition has been wonderful, and everybody is building it out. Vera Rubin, next generation is here, another giant step up. It couldn't have been possible without co-design, took us all these different chips to make it possible. Second half of this year, we'll start shipping. And next year will be the Vera Rubin year, and we'll be shipping at scale.
In the meantime, I think that you should know that the AI community is doing really, really good work. ChatGPT was followed on by o1, which is completely revolutionary, which triggered reasoning models of all kinds. Open models are now the second largest model in the world. That's probably the best way to think about it. And our company is -- has been doing this for some time, but we are a giant model builder as well. And we build models and do software that we share with the entire community. And I think this year, agentic AI and physical AI are really going to hit their stride. And so have a great year. I wish I had a voice for you, but I'll go -- I don't know where I left it, but I will find it. All right, guys. Happy New Year.
NVIDIA — Special Call - NVIDIA Corporation
🎯 Key Message
- Core statement: NVIDIA emphasizes system-level rather than single-chip leadership: Vera Rubin (the new AI factory architecture) is in full production, the Groq and Enfabrica acquisitions add specialized capabilities, and BlueField-4 plus a new memory tier address rising inference and context requirements.
🔺 Strategic Highlights
- Vera Rubin: Pod architecture in full production; goal: broad delivery, with volume shipments from the second half of the year; focus on co-design (GPU, CPU, NIC, switch, software).
- Infrastructure stack: BlueField-4 as a new network/storage element (KV/context tier) to reduce latency and network load; long-term TCO strategy.
- M&A & ecosystem: Groq (ultra-low latency) and Enfabrica (scale-up fabric) complement the platform; NVIDIA remains the platform for all major models and promotes open models.
🆕 New Information
- Product status: Vera Rubin "in full production"; Colette Kress cites market launch and first revenue recognition effects in the second half of the year; BlueField-4 and a new context memory tier were introduced.
❓ Analyst Questions
- Performance comparison: Requests for benchmarking vs. TPUs; management points to MLPerf as the reference, as direct TPU comparisons are rarely publicly available.
- Ramp & margins: How quickly Vera Rubin scales and what effect that has on margins; management expects a fast ramp and emphasizes value-driven pricing (margins remain robust).
- Supply & China: Supply, DRAM/HBM and export licenses (H200 for China) were discussed; management assures separate supply streams and sufficient capacity.
⚡ Bottom Line
- Conclusion: The event delivers operational substance: Vera Rubin is in production, strategic acquisitions extend the platform, and new network/storage building blocks address the growing token/inference problem. For shareholders, this means increased upside potential from faster product revenue, but also dependence on a successful ramp and supply execution.
NVIDIA — CES 2026
1. Management Discussion
Welcome, and thank you for standing by. I would like to inform all participants that this conference call, as well as any Q&A, may be recorded. Where a company is presenting, any recording may also be posted on their website. Views and opinions expressed by any external speakers on this call are those of the speakers and not of JPMorgan. Parts of this conference call may be reproduced in JPMorgan Research. If you have any objections, you may disconnect at this time.
Unless otherwise permitted by internal JPMorgan policy, members of JPMorgan Investment and Corporate Banking are not permitted on this call and should disconnect now. I would now like to turn the call over to your host.
2. Question Answer
Thank you. Good morning. Happy New Year, everyone, and welcome to JPMorgan's virtual fireside chat series at the 2026 Consumer Electronics Show. My name is Harlan Sur. I'm the semiconductor and semiconductor capital equipment analyst at the firm.
Very pleased to have Colette Kress, Chief Financial Officer of NVIDIA, here with us this morning. It's been a tradition for the past 12 years to have Colette and the NVIDIA team kick off the investor event here at CES. Colette is going to start us off with an overview of Jensen's NVIDIA live event yesterday, and then we'll go ahead and kick off the Q&A. Colette, thanks for joining us today. Happy New Year, and let me go ahead and turn it over to you.
Okay. Let me first start with a reminder, folks, that this discussion may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to risks and uncertainties facing our business. And then I'll get back to CES and the essence of the announcements that we were here making yesterday.
It's an important time for us to remind everyone about the transitions that are taking place in the market today. Those are 3 different transitions, and all very important ones. The first one is one that we have talked about for several years: the need to move to accelerated computing. We're beyond the ability, in our current development, to advance that work with just the CPU. So folks are moving to accelerated computing throughout the world.
Secondly, the development of generative AI is also a key transition. It is changing a lot of our work today; whether it be search or any of the social media or otherwise, generative AI is also taking part.
But in the future, we also see the third important transition as we move to agentic. Agentic AI is really where it is getting work done, work that can augment the work of many employees, many of our folks at home. All locations are really important, we think, going forward. Those transitions are all occurring and creating exponential growth in terms of our compute.
So that's one of the opening statements that we just want to remind folks of, in terms of what we see in AI going forward, but also what we're doing in terms of accelerating today. This event highlights a lot of different focus areas: not only AI and AI for business, but also the work that we are doing with robotics and really thinking about physical AI going forward.
But an important part of the discussion was talking about our next and upcoming version, Vera Rubin. Vera Rubin, as we discussed, has definitely taped out and is ready to go, but this is an opportunity to help folks understand that we are in good shape in terms of bringing this to market in the second half of the year, as we are in full production.
The important part of Vera Rubin, as we discussed, is that it is different chips. And I think it's important to talk about what that means: 6 different chips that have been extreme co-designed to create a data center infrastructure at scale. This isn't about coming here and talking about one different piece, or saying that we are designing and building out the rack. It's more than that, in that the design of every piece continues to be thought through in how it works with each and every one of those different types of chips.
So 6 chips that we're talking about. First, of course, Rubin, our GPU, and Vera, our CPU. It's our next version of the greatest piece of what we can do in scaling up, in terms of our NVLink. It also takes us to Spectrum-X, what we have in terms of the SuperNIC, but also what we have with BlueField, and then also our switch for CPO. Six different chips have all been harmonized in terms of what we are bringing to market.
We're excited in terms of all the different workloads that it will be able to support, but some of the key things that we have seen already help to understand that this is a full system that will essentially be able to take the time to train down to 1/4 of what we had in terms of Blackwell. Additionally, you have the capability of 10x higher throughput. And thirdly, an important part in terms of the inferencing phase: it's actually 1/10 the token cost throughout.
So these parts and bringing that together, we are getting ready for that to continue to scale in the second half of this year. And then we'll be in full ramp as we move into the next calendar year as well. So those are some of the highlights, and we can talk about more of it in the discussions.
Yes. No, that was a great overview, Colette. And Jensen spent quite a bit of time yesterday focused on physical AI. And the team has framed physical AI as a massive opportunity powered by platforms and models like Cosmos, Omniverse, Isaac, right? And vertical-specific frameworks like GR00T and Alpamayo, right?
Customers are already here at CES. They're already bringing robots in many different verticals to market using Cosmos and GR00T. The Mercedes announcement yesterday is leveraging the Alpamayo-based reasoning model, right? Is physical AI -- is this already a financially material contributor to your data center revenues? And how should we think about the growth curve over the next few years for physical AI?
Physical AI is yet another great opportunity once we advance the agentic AI. And you're correct, they all are different types of models that are going to be needed for the physical AI. The important part of what we brought to market and what we discussed about is really the need for the open source model. And right now, if you think about the top proprietary models, the next in line is the formation of all of the open models and how important these are.
Now these open models are important definitely for the enterprise and the work that they're doing, but being able to manage for physical AI, the abundance of modeling there coming up in store and what it was being designed, whether that be for research and whether that be entry to developing the content for them, those models are now in service and here today. So here on the CES floor, even here in terms of our offering, we have full. We're about visibility but also what you have in terms of automotive.
Your question is, are we seeing that today? And yes, Mercedes is coming to market as a result of the very hard work that we have done over the last 8 years to move to a very high-end self-driving capability in the car, really focused in terms of the safety and the lock. Mercedes has now been able to take the lead as one of the safest cars that will be in the market.
So yes, we are definitely earning revenue from our work with Mercedes, as well as many others that are using our platform, whether that be back in the data center, and that's an important piece to keep in mind, the amount of data that is collected and put together in the data center, as well as what is inside of the cars.
As we move forward, taking that to an area such as physical AI for robotics is also going to be extremely important. The learning and the simulation from what we've seen in automotive carry over very nicely to what we will be able to do with robotics as well. So yes, as part of that, we see much work in terms of our Jetson platform, our Omniverse platform and now also our open models helping these important parts of physical AI.
That's great. And you touched upon a very relevant topic in the opening remarks, which is that the team is in production with your next-generation Vera Rubin accelerated compute platform, on track to launch in the second half of this year, in line with your aggressive product cadence. 6 chips, as you mentioned, in the Vera Rubin portfolio. Initial performance relative to Blackwell is very compelling, right? 5x better inferencing performance, 3x better training performance.
And as you mentioned, and most important to your customers, the 10x lower potential cost per token. We all track the value chain, the supply chain. But if you look at the strong demand curve ahead of you, what are the product areas or categories of the supply chain that you could see constraining your shipments as you start to unlock Vera Rubin in the second half of the year? Could it be 3-nanometer wafer supply? Would it be CoWoS? The memory? Any bottlenecks that you foresee as you think about the strong demand ramp in the second half of the year?
Yes. I think it's right to indicate, yes, there's a tremendous amount of demand out there for both AI and accelerated computing. And we have been focusing on that significant amount of demand, and then on what type of supply we'd have to purchase.
Keep in mind the work that we do in building any one of these data center infrastructure systems, from the very beginning to the very end. It could take anywhere from 3 quarters to a year to be completed. That means a lot of our supply purchasing is not taking place today for what we need tomorrow. It has been in the works for a couple of years, because what it takes is focusing not only on our supply, but on the capacity that our suppliers need, and that is an important part of our process, thinking through every single one of our generations and our future generations and working with our suppliers.
We feel very solid about that in terms of what we see in this new calendar year and what we have in terms of supply. As we move forward, it's something to think about as more and more growth comes: how much more can our suppliers do? But we feel good in terms of what we have ordered, what we have been confirmed for, and the supply that we will take for this year.
That's perfect. Why don't we take a step back for a second as we enter the new year, and does it build the concern focus as it relates to NVIDIA in terms of how the market thinks about the engineered team and the trajectory of growth is as we step into the year, the market is always focused on. And by the time that we second to the year, we already have a pretty good view of customers' CapEx in being change, right?
So the market, as the market always is, is very forward-looking, right? And I think the market is starting to think about the infrastructure growth trajectory looking into calendar year '27, right? And if I go back to October of last year, when Jensen talked about $500 billion of visibility and backlog through '26, right? That's both on Blackwell and Rubin GPU fabs, right? And we know that lead times for rack-scale-based solutions are 9 to 12 months.
And it takes a significant amount of, as you mentioned, supply chain management and coordination, capacity buildouts, et cetera, right? But the best proxy, I think, for continued CapEx and infrastructure spending by your customers is your customers' forecasts and orders beyond '26, right, which I assume the NVIDIA team is already focused on. I'm not asking you to quantify, but given what you see in your orders and customer forecasts, are you already seeing a continued spending growth profile by your customers into calendar?
Yes. So let's go back to our GTC D.C. That was an opportunity to help you understand that the combination of Blackwell and Vera Rubin together is about half a trillion through that period of time, through '26. But the important part, correct, is now starting to talk about 2027. And think about what it would take to stand up the compute and a full-scale data center.
That is years to do, from the land, power and shell, to finishing up the buildout, to eventually putting in the compute and getting that ready. So when our customers see an event like today, they know that Vera Rubin is here. There have already been discussions about the amount of demand and where they will put that in the land, power and shell that they have up and coming in '27. So that's the right way to think about it.
We're still working on '26. There is still a shortage relative to demand, and they are still looking at whether there are quick adds that we could also make in '26 to help fuel what we need in terms of our demand. So both of these things are happening at the same time, but this is being very helpful to them. They have a good understanding, from an engineering perspective, of what's capable, and now they can start thinking through the volume of what they will need for their data center builds. So yes, that is exactly where we're focusing as well.
On the market concerns around an AI bubble, Jensen, and you, as you mentioned in your prepared remarks, right, have articulated 3 compute platform shifts that are all happening at once, which should mitigate a spending level, right? And I often feel like the market sort of misses this.
And first is the transition from CPU compute to GPU accelerated compute, right? I mean we're seeing this in so many traditional CPU-based compute workloads and dominated segments of the market where, over time, they're moving from CPU compute to GPU accelerated compute, right? Jensen always talked about this, but EDA, chip design software, is a perfect example where most of the chip design software workloads were run on high-performance server CPUs not that long ago.
But today, many of them are running on GPU accelerated compute architectures. You see that in the simulation market. You see that in the database markets and so on, right? So that's the first sort of transition, right, the CPU to GPU accelerated compute in the existing traditional compute base. The second driver is, as you mentioned, the strong adoption of Gen AI. And the third transition, again, as you mentioned, is agentic AI. And of course, the onset of new foundation models that will power things like physical AI, right?
So as we stand along all 3 of those compute platform shifts, where are we in terms of the adoption curve and contribution to your current data center revenue profile? And specifically looking into 2030, right, let's take a longer-term view on this, but looking into 2030, how are all 3 of these shifts going to profile into that sort of $3 trillion to $4 trillion of data center spending that the NVIDIA team is forecasting during that period of time?
Great set of questions. First, looking at accelerated computing. Accelerated computing is already here, and many of us have seen it and are working with it almost every single day. There's a massive transformation of how search is completed, of recommender engines, and essentially almost all of the consumer Internet and how we market to businesses and consumers. That's an important piece.
But keep in mind, it is going to be a multiple-decade transition to get through it all. There's a large amount of moving to software 2.0 and transitioning from CPU software to a different form of software on accelerated computing. So we're in the early parts of it. It's moving quite fast. Folks do see the great benefits of accelerated computing and of being able to manage the significant amount of data there is, and you're going to see that for some time moving forward.
However, moving also to the work that we see with generative and agentic AI: the important part is that it has also created exponential growth in the amount of compute that's necessary. Because one of the very big parts of moving to agentic was the long thinking: what can I do to get a response on a very difficult, challenging question? And that additional long thinking takes a lot more inferencing demand and a lot more token generation as well.
So we are also now seeing a surge in that demand as we move forward. And as we look at AI going forward, we are in nothing more than the early stages as we move towards these various solutions that will augment a lot of the work that we do in our offices, as well as what we do in our personal lives. So we know these big markets are driving a lot of this demand. And on no side do we see any type of shortage of that or any type of stopping.
There's a lot more work to get completed. And the world as a whole still has to get that completed, not just some parts, and not just what we see here in the United States. We have a lot of different sovereign AI going on, and we have many, many different industries.
You have to go industry by industry. You can look at social media, but you have to look at health care, you have to look at automotive, you have to look at industrial, manufacturing. All of these have unique ways in which work has to transition and AI can be introduced. So a lot still to go. And that's why we indicated that by the end of the decade, we are definitely going to be up there in the multiple, multiple trillions, in the $3 trillion to $4 trillion range, of what will be spent on building out accelerated computing and AI.
Maybe more near term, focusing on calendar '26. Going back again to Jensen's comments at GTC back in October, when he talked about $0.5 trillion of revenue in the backlog of cumulative Blackwell and Rubin shipments through '26. Obviously, as you move forward in time, you continue to get updated forecasts and orders.
Ex China, let's talk about China a little bit later, but ex China, has that $0.5 trillion worth of visibility and backlog through '26 continued to improve? And at what point are you supply constrained and need to push any more orders into.
So the demand, as we see it, continues to increase as folks are looking to enable more compute for a lot of areas, the long thinking. We see this every single day, and since the time that we said $0.5 trillion, of course, we've seen new announcements of new deals, focused on the CSPs, the model makers, as well as many of our neoclouds looking to add more on to that.
So yes, more has occurred, and we are now starting to see folks work on providing the orders. We have orders for Vera Rubin, and folks are focusing more and more on thinking out a full year of volume of what they may need in terms of Vera Rubin. So we're in a great position in getting a better understanding. We've found over many, many years that the more insight we provide them on where our infrastructure is, the easier the planning process is.
So their demand needs are quite strong, and we are definitely in that process. So yes, that $500 billion has definitely gotten larger. And now we'll probably look at next year as well, to start building up for all the different demand that we have there. But we cannot say anything more than that demand is quite strong.
That's great. No, that's exactly what we're looking for, and that's exactly what we thought. Maybe switching gears because Jensen and the team did a great job, and you did a great job of laying out the performance specs, as I mentioned to you before, right, 5x inferencing performance on Vera Rubin versus Blackwell.
That's on the inferencing side; 3x better training. And then what's most important is the economics to your customers, and you guys are driving 10x lower cost per token on Vera Rubin versus Blackwell. But I think the market has gotten a better appreciation for what you talked about, co-development, as you bring more systems and rack-scale solutions to the market.
It is a solution that is optimized not only around compute. It's optimized around compute, it's optimized around networking, it's optimized around storage, right? And so let's talk about networking, right? There's been lots of focus on networking lately, especially as NVIDIA and the industry transitioned to rack-scale solutions. There's a significant step-up in networking dollar content, given the scale of connectivity with your NVLink networking and switching portfolio. Networking attached to your compute revenues was around 19% in your fiscal Q3 of last year. And we define networking attach as networking revenues divided by compute revenues, right? That was about 19% in Q3, and up at 21% in the July quarter.
So on average, about 20% networking attached to your rack-scale compute systems, higher than the average attach over the prior 9 quarters, which was around 7%, I think due to the scale-up adoption, right, as we move to rack-scale. It looks like you also continue to get traction on Spectrum-X, your Ethernet product line. Is 20% the baseline on networking attach? And as you drive more Spectrum-X and your recently announced Spectrum-6 platform, and you've got Spectrum-XGS for scale-across, maybe the mix trends more towards the low to mid-20% range in the mid- to longer term, right? I'm not sure, but I wanted to get your views on that.
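The attach metric defined in the question (networking revenues divided by compute revenues) can be sketched in a few lines. This is only an illustration of the arithmetic: the function name and the revenue inputs are hypothetical placeholders chosen so the ratios match the roughly 19% and 21% quoted above, and they are not figures reported on the call.

```python
def networking_attach(networking_revenue: float, compute_revenue: float) -> float:
    """Dollar-based attach: networking revenue divided by compute revenue."""
    if compute_revenue <= 0:
        raise ValueError("compute_revenue must be positive")
    return networking_revenue / compute_revenue


# Hypothetical (networking, compute) inputs in arbitrary units, picked only
# so that the ratios reproduce the ~19% and ~21% mentioned in the question.
quarters = {
    "quarter_a": (19.0, 100.0),
    "quarter_b": (21.0, 100.0),
}

rates = {name: networking_attach(n, c) for name, (n, c) in quarters.items()}
average_attach = sum(rates.values()) / len(rates)

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print(f"average: {average_attach:.0%}")
```

Note that this dollar-based ratio is a different measure from the unit-style attach rate (the share of system buyers who also take networking) that the answer below refers to.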
Yes. It's a great way to start talking about our networking. We can definitely discuss where we've been historically and where we see things going forward in networking. One of the ways that we have been looking at networking is, when customers are buying the full systems, which almost all of them are, how many of them are attaching networking. That's different from looking at it from a dollar perspective; it's just the attach rate. That is a very, very clean metric to understand. That number is nearing 90%. 90% are attaching some form of our networking.
Let's remind folks that our networking business is #1 in the world, having moved from a very, very small scale. But now, with the full development of all different types of switching capabilities, it is best in breed in terms of NVLink. Nobody has even figured out how to do a lot of what we've done. It has really established adoption of not only our InfiniBand, which has been an important part of supercomputing for decades and decades and is world-class, but also our Ethernet: the quickness of providing those key features in Ethernet, and the adoption of our Ethernet for their businesses as well, has been a huge success, stepping back and looking at this in an AI context.
It's not enough to just have a GPU check, it's not enough to how to base. You're missing such an important part of what the networking does to capture the capabilities of scaling the multiple and multiple ones together, but also dealing with the complexity of traffic and the complexity of responses that you need at some point, we needed training and some point we may be able to manage that all with all of our different inferencing platforms with our networking has been a huge success.
So even as we go forward and move to Vera Rubin, we're already working on some of the most important capabilities, given how important networking has been there. There is also our work on the switch for CPO. It's been important for folks to know the amount of savings and capabilities that you can establish through a CPO environment, and we're excited to go to market with that as well.
But really looking at what we see, it's very interesting. Even if they have only a part of our compute, it's very common that our networking is still being chosen for different systems. Even if they have one of their own ASICs, they will often use our switching capability as well. So we're in a full end-to-end design, and we're really excited about how the networking has also been established within Vera Rubin.
Yes. And as a reflection of the traction on networking, you've always been a leader in InfiniBand switching, right? And as your customers were clearly signaling to the NVIDIA team that they were moving to more Ethernet-based switching, the team brought to market your Spectrum-X Ethernet switching platform. That went from like 0 to $10 billion annualized in like record time, right? And I think that when you last updated us, your annualized run rate on Spectrum-X was like $10 billion. I think that was in the July quarter.
And in the October quarter, it looks like that stepped up in annualized run rate for your Spectrum-X platform. Jensen and you and the team announced your next-generation Spectrum-6 platform, right? This is a 120 terabits per second throughput switch, right? One of the fastest switches in the world. You're bringing that to market with Vera Rubin, right? So if you think about the $12 billion, $13 billion sort of annualized run rate in the October quarter, you've got a new platform coming out in Spectrum-6. You look at your order book for Vera Rubin. Like, where could this number on Spectrum be as we move through next year?
So, not giving a forecast going forward, but to understand where we already are in terms of the attach: we're going to see something similar in terms of our growth in compute and our growth in networking over time. The only difference that you do have is just the timing of when each of those systems is put together in the full data center infrastructure that they're building.
You may have -- parts of that networking is the first things that are put in place in terms of the data center and with some of the last part of the data center as networking, that's the only thing that really changes the growth. But so we are expecting nearly these things, not more of an attach rate in terms of what we are seeing in networking and growth moving forward.
Great. And then maybe switching over to China. I know you got some questions yesterday in the financial analyst Q&A. But following the U.S. government's approval of H200 sales into China, it appears customer interest actually looks very strong. So the question is, has the team started receiving orders from approved China entities for the H200? More importantly, how rapidly can the team start shipping H200 to these customers? And how should we frame the kind of revenue opportunity over the next 12 to 24 months?
As I remember, Jensen had previously, last year, quantified the China revenue opportunity for calendar '25 at $50 billion, growing at a 50% CAGR, right? 50% growth implies $75 billion of potential revenue demand for NVIDIA this year. Is that how we should think about the China revenue and growth profile and opportunity?
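The growth arithmetic in the question can be checked with a short sketch. The $50 billion base and 50% growth rate are the figures quoted by the analyst; the function name and the optional multi-year extension are assumptions added for illustration and are not forecasts from the call.

```python
def project_value(base: float, annual_growth: float, years: int) -> float:
    """Compound a base value forward at a constant annual growth rate."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return base * (1.0 + annual_growth) ** years


# Figures quoted in the question: $50B opportunity for calendar '25,
# growing at a 50% CAGR.
base_2025_billion = 50.0
cagr = 0.50

# One year of growth reproduces the $75B mentioned by the analyst.
implied_next_year = project_value(base_2025_billion, cagr, years=1)
print(f"Implied opportunity after one year: ${implied_next_year:.1f}B")
```

Calling `project_value` with a larger `years` value simply compounds the same rate further; any such multi-year number would be pure arithmetic, not something stated by either speaker.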
Great question. Let's first talk about the H200. We're very pleased that the U.S. government saw that this was the right opportunity for us to be able to compete fairly worldwide and to provide a really good product to China. And that's what this is all about. The ability for us to ship H200 to our customers still requires a license from the U.S. government, and the U.S. government is working on that process right now in order to determine the licenses for the customers. So the customers have requested the licenses, and we are now awaiting that part.
But at the same time, we have heard from these customers from a demand perspective. That's important for us so that we can be prepared as those 2 things come together. The POs and the completion of the licenses with the U.S. government will set us on our way to begin shipping the H200 to China. We hope that gets done soon.
But again, it's not something that we can control right now, though we are very pleased with the U.S. government's decision. So we're going to wait and see what will happen. Stepping back, though, to the question of what the demand is in China: it's a very, very important economy and has a tremendous number of strong engineers and AI engineers, comparable to what we see here in the U.S.
So it's also a very big business, as Jensen articulated, and it's not a static business. It's going to grow very similarly to what we are seeing here in the United States, if we can continue selling going forward under any of the different licenses that the U.S. government has. So more to be determined on that, but let's just wait to see how we can get our H200 out.
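A note on the arithmetic in the question above: the $75 billion figure follows from compounding the $50 billion calendar-2025 estimate at 50% for one year. The sketch below only reproduces that calculation. The function name `project_revenue` is a hypothetical helper, and the two-year extension is an illustrative assumption, not a figure given on the call.

```python
def project_revenue(base_billion: float, annual_growth: float, years: int) -> float:
    """Compound a base figure forward at a constant annual growth rate."""
    return base_billion * (1 + annual_growth) ** years


# Figures quoted by the analyst: $50B opportunity for calendar '25, 50% CAGR.
base_2025 = 50.0
cagr = 0.50

# One year forward reproduces the analyst's $75B figure.
one_year = project_revenue(base_2025, cagr, 1)

# Two years forward is an illustrative extension only (not stated on the call).
two_years = project_revenue(base_2025, cagr, 2)

print(f"after 1 year:  ${one_year:.1f}B")
print(f"after 2 years: ${two_years:.1f}B")
```

The answer that follows does not confirm these numbers; it only describes the market as large and growing.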
Got it. And then on the recently announced nonexclusive licensing deal with Groq: Groq was focused on this SRAM-based, high-throughput inferencing engine. Very good for low user count and low model parameter inferencing, it seems like more of an enterprise-focused solution versus NVIDIA's inferencing solutions, which focus on very high user count and massive context input capability, right, more targeting foundational model developers. I wanted to get your views on the rationale for the Groq transaction. And how does NVIDIA think of integrating their technology into your product road maps and target markets?
We're very pleased to have the Groq IP with us. That's what we set up with an IP license from Groq. But the other most important part of it was an exceptional team that has now joined us as well. You are correct, they have done a lot of work in inferencing, low-latency inferencing. We're seeing tremendous engineering horsepower there. We found it quite exciting and very similar to our own thoughts and work going forward as well.
Bringing them on board with that IP, we're excited about what the teams could work on together. So excited that we got it done before the holiday. We have that completed, and many of them are already with us beginning that work. So stay tuned. We don't have anything yet in terms of the exact timing of when something will come to market, but this is an important area.
Given the complexity of inferencing, the size of the inferencing market, the different needs there are going to be, and such an exceptional team, we will be able to put something great together.
In terms of some of the market concerns that we continue to hear about, right, one of them is the concern around the gap between the current financial profiles of a few of the foundational model builders and the data center compute capacity that they've committed to over the next several years: OpenAI, Anthropic, et cetera, right? They're committing to a lot of capacity with you, with competitors and with some of the large hyperscalers. Obviously, these AI labs will have to raise money, right? So how do you think about the risk to NVIDIA's business?
The model makers include both foundational model makers and open-source model makers. Most of them, if you look at them as a whole, are being very methodical, piece by piece: they build a new training model, then it's "okay, let's move to the inferencing," and then "now let's get started on my next one." Many of them have worked through how to fund that: the raising of cash, the raising of equity, the combination of the 2, and how to work that carefully, either with the funds or by themselves.
I think a lot of that is very solid diligence, and it's what we'll probably see continue going forward. They are essential. These foundational models are essential, conceptually, to where we're going. So they are working, forming and storming, on how to get that completed. I think it has gone very well. Sure, they're looking long term to help us understand. This is not about completing AI in the next couple of years.
This is decades. So they may talk about it in terms of gigawatts of size as we go forward. But the reality is, it's really about year by year or quarter by quarter: how do they need to build, where do they need to build? Are they on the research side? Are they looking at inferencing? And I think that process is fine.
Many of them are also with the CSPs. That's a very big help for them. The quality of what the CSPs can provide, so that they can concentrate on building up their models, is a great combination, and we're happy to support that. Much of the work that we are doing is sold through the CSPs and, through them, to the model makers, whether those CSPs be in their cloud or some of the long-standing, tremendously great CSPs that we've had. It's working quite well in terms of all that work. So I think we're going to see more of that to come. But again, we just have to take this day by day, step by step, and think about what they're planning to put together.
Colette, we're at the Consumer Electronics Show. And the one thing that we noticed was a distinct absence of new GeForce gaming platforms this year. So I guess the question to you is, are there concerns about the continuing supply of DRAM and HBM memory for gaming? How are you prioritizing the allocation of these components, gaming versus data center? Do you think there is potential for demand destruction in the seasonally stronger second half of the year, especially given that DRAM pricing looks set to continue to increase through the remainder of this calendar year?
Our gaming business has been a home run. Our standing with our gamers continues to be tremendously strong, and what we brought out with Blackwell was also hitting great strides. At the very beginning, we underestimated that growth. That growth was so fast at the very beginning, but we have now brought supply up to a good level. But given our size as a percentage of the gaming market, we're going to continue with some prioritization of what gamers will need as we go forward.
There is still more to come later this year and next in terms of focus. But the part that we're most pleased about is that these platforms enable creative and AI types of uses, and that is a really important business model. So stay tuned as we think this through. Demand is, again, quite strong, and we're going to try to make sure we serve as much of that demand as we can.
And then my last question, and I appreciate the time spent here. You've guided to mid-70s gross margins while acknowledging, right, the potential for rising input costs. Looking into this year, which levers matter the most to protect the margin? Is it mix? Is it pricing, cost-downs? Is it supply chain efficiencies? And where are you least willing to compromise as you think about these levers?
Yes. It's always an interesting discussion on the gross margin piece. It really shows our focus on not just getting the compute out, but doing it from a great position, both with our manufacturers and our suppliers and in terms of how our internal teams can do this well.
We have stayed very close to that mid-70s right now. We don't want to look at this as, yes, we're here to grow, grow, grow that higher. We are here to keep what we said, which is mid-70s right now as we go forward. It takes a lot of different things. When you work with the complexity of the system, you are focusing on every last component. We have already done a significant amount of reordering.
We do understand what it took in terms of the capacity of many of our suppliers, and we're very supportive of the many different suppliers that have pulled that together. But that now moves us to working together with manufacturing. How do we improve that cycle time? How do we think about improving all of the different areas of the business as a whole? Not only can we do better and focus on that cycle time, we can also improve the cycle time of getting product to customers faster.
Remember, as we move into this new year, we still have a combination of different platforms that we're building. It's not just one product. That will both be an enabler and also create a mix effect that we have to keep in mind as we move into this new year. So right now, in all of our steps for Vera Rubin as well as what you see with GB300, we are very serious about that process and getting that together. So we do feel confident that this will also be something that we can manage well. But let's not look at it as something easy. We will continue to work to stay at about that same level.
Absolutely correct. We're just about out of time. I want to thank you as always, for your participation and your support. We look forward to strong growth ahead this year for the NVIDIA team and another solid year of execution by the team as well. So thank you very much for your participation and support.
Thank you so much. Have a great day.
Thank you.
NVIDIA — CES 2026
📌 Key message
- Core: NVIDIA sees three long-term market transitions: CPU→accelerated computing, generative AI, and agentic/physical AI. Management emphasizes that the next platform, Vera Rubin, has taped out and goes into production in H2, bringing strong performance and cost advantages over Blackwell. Networking and China licensing remain central to realizing this.
🎯 Strategic highlights
- Vera Rubin: Six-part rack-scale design (GPU, CPU, BlueField, switch, Spectrum‑X, among others) as an integrated system solution, with a focus on lower energy consumption and higher throughput.
- Physical AI: Omniverse/Isaac/Jetson plus open models as levers for robotics, automotive and vertical applications; first commercial customers (e.g. Mercedes) are generating revenue.
- Networking: High attach rates (~90% according to management) and Spectrum‑6 as growth drivers; switches and NVLink raise the dollar share of system revenue.
🔭 New information
- Product status: Vera Rubin has taped out, with production start targeted for H2; management cites up to 5× inference, 3× training and up to 10× lower cost per token versus Blackwell.
- Supply/licensing: The company says firm procurement plans are in place and that it feels "solid" for 2026; H200 sales to China still depend on U.S. licenses.
- Acquisition/IP: Non‑exclusive license and team hire from Groq (SRAM inferencing IP); integration announced, time to market open.
❓ Analyst questions
- Physical AI adoption: Analysts asked about current relevance and revenue share; answer: first revenue is there, greater value expected in the medium term.
- Ramp and bottlenecks: Demand is high; management sees procurement plans as prepared but names possible bottlenecks in building capacity and in the long-term supply chain.
- China and H200: Demand from China is strong, but shipments depend on individual U.S. licenses; timing uncertain.
⚡ Bottom line
- Conclusion: The call confirms a technologically dominant product roadmap and considerable revenue/backlog visibility. Positive growth signals from Vera Rubin, Spectrum networking and new IP, but operational risks remain: supply chains, ramp execution timing and outstanding China licenses. For shareholders: positive for medium-term growth; in the short term, watch ramp timing and regulatory/supply risks.
NVIDIA — UBS Global Technology and AI Conference 2025
1. Question Answer
Good morning. We're going to get started here. I'm Tim Arcuri. I'm the semi and semi equipment analyst here at UBS, and we're very pleased to have NVIDIA. Pleased to have Colette Kress with us this morning.
Thank you. Great to be here.
So before we begin, I think you have to read a.
I do. Okay. As a reminder, this discussion may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to the risks and uncertainties facing our business.
Perfect. Well, so right now, Colette, there's basically 2 debates. One is whether there's an AI bubble; and two is the competition. So I wanted to address these one by one. So first, what is the market missing? When everyone talks about an AI bubble, what is the market missing versus what you see in your business?
Yes, it's a very interesting discussion, with a lot of words focusing on some very interesting thoughts regarding the supposed AI bubble. No, that's not what we see. What we see is 2 to 3 different major transitions happening in the market. And we've talked about these transitions before. First, let's not forget the need to transition to accelerated computing. Most of the workloads, most of the work done in the data center, have been run on CPUs for years.
But our focus is on transitioning that to GPUs. It's a necessary thing because there's just not going to be any improvement that we can see from continuing to use CPUs. So that's one of our first pieces. So when we think about our outlook of, by the end of the decade, $3 trillion to $4 trillion worth of AI, or just total data center infrastructure, probably about half of that is focused on that transition.
We're in the early parts of that. And what you're seeing, for example, with the hyperscalers, the very large CSPs as well, is that this is a very big part of the work that they are doing. So you are seeing them revising search, revising recommender engines and revising overall social media. This is a very big part of what we are seeing today. There is also the transition that is going to be necessary for AI, including what you need for accelerated computing, and agentic AI is moving into that piece. But keep in mind that it is only one part of what we see today, and we're going to see that continue to grow through the rest of the decade as well.
So of the $3 trillion to $4 trillion that Jensen talked about by 2030, would that include replacing all of the existing $1 trillion worth of data center infrastructure?
Absolutely, it can. A lot of that is going to be necessary. How long will that take? But keep in mind, they're also growing. So it's not just about the existing base; as they continue to grow, they are also going to have to add more and more accelerated computing.
Got it. Let's talk about competition. So we haven't even seen a model yet trained on Blackwell. So everyone is a little up in arms about whether your competitive lead is shrinking or not. So maybe you can speak to that.
Yes. So let's just talk about where we stand. We're very excited about the Grace Blackwell configurations that we put in the market. That's both the 200 series as well as the Ultra series, the 300. Today, you're going to continue to see more and more models coming into the world. Those models are being built right now, and you're probably going to see the new models coming out in about 6 months. When we created our Grace Blackwell configuration, we made an important change in terms of completing full data center scale. We often refer to that as rack scale.
But I think the important part to remember is that this was a focus on the extreme co-design that would be necessary, not with just 1 chip, but with 7 different chips all working together to create what is going to be very important for both accelerated computing and many of the new models that will be coming to market. So we're very pleased with that. But keep in mind, it is not anything similar to what you may see in a fixed-function ASIC. It's very, very, very different. So today, everybody is on our platform. All models are on our platform, both in the cloud as well as on-premise. All workloads continue to be on our platform.
And from what you see of the performance of these chips, what you see going on with ASICs and how much you're doing with racks, integration and scale-up, do you feel like your lead is shrinking?
Absolutely not. I think our focus right now is helping all the different model builders, but also helping so many of the enterprises with a full stack, one that incorporates not just the hardware. Remember, everybody needs assistance to transform their software, and our software platform, with CUDA and all the additional libraries, is one of the best reasons why people continue to stay on our platform.
That platform is one that is usable for a significant amount of time and actually gets better over time. You've seen that our continued work on enhancing our software can give you an X-factor improvement. So we're going to continue working on that and watching customers use that capability to continue building new models on our platform, while also maintaining all the same infrastructure that they already have on-prem for running their models as well.
And I get the question a lot about how much of what you're shipping is replacing existing GPUs versus just additive to the existing base. And it seems like almost all of what you're shipping is just additive to the base. We haven't even begun to replace the existing installed base. Is that correct?
It's true. It's true that most of the installed base still stays there. And what we are seeing is that the advanced new models want to go to the latest generation, because a lot of our co-design was working with the researchers at all of these companies to help understand what they're going to need for their next models. So that's the important part. They move that model to the newest architecture and keep the existing infrastructure. So yes, to this date, most of what you're seeing is brand-new builds throughout the U.S. and across the world.
And I think Jensen mentioned on the call that there are still AI workloads being done on Ampere. So can you talk about that? When I talk to some of these neoclouds and I ask them how much they're renting Ampere for when it comes off lease, they say it's for pretty much the same price. So obviously, demand even for the old instances is still pretty high.
It is. We still see Ampere. We certainly see Hopper continuing to be used. That's very helpful for them in terms of the internal research that they do and the work that they are doing to fine-tune their models. They can, again, use it because the software is backwards compatible and forward compatible. So all of that continues to work in what they're doing.
Can we just talk about the profitability of inference workloads? And I'm wondering if you can speak to how profitable these inference workloads are for your customers? And any anecdotes that can sort of help make the case for ROI. We always hear about ROI. And so any anecdotes that you can talk about, maybe as these Blackwell racks ship what the ROI is for these inference workloads for your customers?
You should also think about what we're already seeing with the workloads and what is driving, right now, the need for more and more compute. Why is that? We had talked earlier about how reasoning models would be an important part of what the model builders were building. It was not enough to just have a single response; it's about thinking, long thinking, reasoning. It's a very big part of the models that were being built. So those are now coming to market, and you'll see more and more of them, again, on the Blackwell architecture.
So what does that drive? Up front, it means they need more compute. The 3 scaling laws that we've talked about are still intact for all of the different model builders. So they're building greater and greater models with that reasoning. And then what happens is more and more token generation. That token generation also has another piece to it, which is that you have more users.
Now you have both more token generation and more users. Users are also now saying, I could be buying that, I would absolutely pay for being able to do that. So inference has not only moved to reasoning-type models; there's now a margin actually being created that fuels, again, more compute and more models. So you've got a flywheel happening already in terms of how inferencing and token generation have developed.
Yes. And all your customers on all their public calls, they all talk about if I had more compute, I could generate more revenue.
That's correct. More compute, more tokens.
Can we talk about this disconnect: some of the model builders don't have very much revenue, yet they are committing to a lot of capacity with you, with the supply chain and with some of the large hyperscalers. The hyperscalers do have money to spend, but the model builders don't have a lot of money today. They have to raise the money. So how do you think about that as a risk to your business?
So let's first step back to what we talked about at the very beginning here, really understanding that those hyperscalers are continuing to buy compute for their internal use and for the work that they are doing in transitioning to accelerated computing. The model makers that are out there, you're right, they do need more compute. But just like all things, they're going to have to work through it. Have I earned enough in terms of profitability? Can I raise more capital? And can I take a look at the additional compute that I would need?
All of that is still in motion. I think it was helpful for all of the model makers to help us understand what is that vision, what are things in the future, giving us an understanding of the options that they have and we'll be here to support them. But again, right now, a lot of our work is on today and the next year and the year after that to make sure that the right amount of compute capacity and capital is available for what they need today for those models.
So it's more of a longer-term aspect in terms of that piece. But again, our focus in terms of demand and supply, our supply and our demand is based on do we have POs, do they have the ability to pay in terms of the capital. That is -- nothing has changed in that perspective.
Great. On that -- just in that vein, can we talk about your partnership with OpenAI -- you did announce this big LOI, 10 gigawatts, which, by our math, is somewhere in the range of $400 billion over the life of that deal. How much of that 10 gigawatts is actually locked in? I imagine maybe there's a gigawatt that you're planning to ship next year. But the agreement is more of an LOI framework agreement and you're allowed to invest along the way. So can you talk about that?
Yes. So OpenAI and our agreement with them: it's a very strong partnership, a partnership for more than a decade. And we're their preferred partner for the compute needs that they have. But keep in mind that today, within our $0.5 trillion worth of Blackwell and Vera Rubin, what's there for OpenAI is really based on the continuation of the CSPs who are helping them with the compute that they need. So right now, that $0.5 trillion doesn't include any of the work that we're doing on the next part of the agreement with OpenAI.
So we believe we'll continue working with OpenAI. Yes, we still haven't completed a definitive agreement, but we're working with them. Their desire is focused on, how can I work directly with NVIDIA on how we build out our compute structure? So that's going to be in the future, and we're continuing that work right now to understand how we can help them through that. But keep in mind, right now, all of what we've baked in is just with the CSPs.
So that slide you showed doesn't include anything that would be part of this framework agreement.
That's correct. That's correct.
And this framework agreement would be all direct to OpenAI.
That's the plan. They do want to go direct. But again, we're still working on the definitive agreement.
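The analyst's framing of the letter of intent can be checked with simple division. The sketch below assumes, purely for illustration, that the roughly $400 billion estimate (UBS's own math, not an NVIDIA figure) scales linearly with the 10 gigawatts. `value_per_gigawatt` is a hypothetical helper name, and the 1-gigawatt first-year figure is the analyst's guess, which the answers above do not confirm.

```python
def value_per_gigawatt(total_billion: float, total_gigawatts: float) -> float:
    """Average deal value per gigawatt, assuming value scales linearly with power."""
    return total_billion / total_gigawatts


# Analyst's estimates from the question: ~$400B over the life of a 10 GW LOI.
loi_total_billion = 400.0
loi_gigawatts = 10.0

per_gw = value_per_gigawatt(loi_total_billion, loi_gigawatts)

# The analyst guesses about 1 GW might ship next year (unconfirmed assumption).
first_year_gigawatts = 1.0
first_year_billion = per_gw * first_year_gigawatts

print(f"implied value per GW: ${per_gw:.0f}B")
print(f"implied value of {first_year_gigawatts:.0f} GW: ${first_year_billion:.0f}B")
```

Per the answers above, none of this is included in the $0.5 trillion Blackwell and Vera Rubin figure, and the definitive agreement was not yet complete.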
Great. Let's talk about your exposure to OpenAI and your Anthropic partnership. How would you contextualize your overall exposure to OpenAI? And then maybe how significant is your partnership with Anthropic?
We're excited about our partnership with Anthropic. Anthropic needs help with more and more compute and is also very focused on our platform. So we are going to help them. This is again a situation through a CSP, working with Microsoft. That's been a big part. Not only are they interested now through the CSP, they're also looking at 1 gigawatt in the future.
So now when we look at all of those model makers, we've got all of them focused on our platform and working with us. So it's a great position. In the case of OpenAI, OpenAI continues down their path of what they need. And I do believe our work with them will never end in terms of the engineering-to-engineering focus, as we've been assisting them and working with them through our engineers.
So when you saw them making all these commitments, I think those announcements came out over a couple of week period, maybe it was a month. Did that make you concerned at all about your direct and indirect exposure to them? There was just such a flurry of announcements.
No. I mean, they are an indirect customer through the CSPs, but most of the model makers are also indirect. So our position still stands that our CSPs are approximately 50% or more of our revenue each and every quarter, and have been for quite some time. Now, their work in helping the model makers, which we support indirectly, is a fine process. The capital needs are being helped along by using the CSPs.
Great. Let's talk about Vera Rubin for a moment. So the transition to Blackwell Ultra has been very smooth. Can you give us a sneak peek on Vera Rubin and what this ramp could look like and the potential leap in performance we could see relative to Ultra?
Yes. So Vera Rubin, we're pleased to say that it has been taped out. We have the chips and are working feverishly right now to get ready to bring that to market in the second half of next year. We're very pleased with what occurred with Ultra. People come in and say it was seamless, and that's what we wanted. So a seamless transition, very, very helpful for many in the new models that they were creating. And I think you're going to see an X-factor increase in performance as well as we think about Vera Rubin. So it's right around the corner for the second half of next year. We're very excited for it.
So there was a point at which Jensen, I think, said even if any competitor offered their product for free, nobody would buy it. And obviously, some people are buying from some of your competitors. They're nowhere near the scale that you're at. What's changed? Would you just say that it's -- well, look, the market is growing so much that obviously, they want to just hedge their risk?
You have to look at that kind of statement as saying that the performance and NVIDIA's ability to create full systems that can accomplish any type of workload, any place, for any type of model, is quite unique in what has been designed. The idea that a fixed-function type of product would be able to do something similar leads down the path that says, you can take it for free and you may still not benefit from it. And that's what he sees, and that's what we all see. So it's very important, as they think through not just what they need to do to build the model, that they have the ability to not only train the model but also complete full inferencing, all on the same type of architecture.
Each part of that is designed to do so. Being able to scale in many different ways has been extremely important, scaling up in particular, and NVLink is important to do all of that for the model making that is happening right now. So once you move from training and want to go into inferencing, again, you are talking about a full system that has been engineered for inferencing, with all of the networking focus that is going to be necessary to make sure traffic and everything else flows well.
Now, we see that today, and it's not just about day 1; it is about the full time you are using that for inferencing. Remember, power efficiency is also extremely, extremely important from the inferencing standpoint. So given the co-design of everything that we did there, it's very hard to think that a very simple chip, a fixed-function chip, would be able to do that. And that's why many stay on our platform for both. You have the capabilities to do all those different pieces together.
Let's just talk for a second about CPX, I get a lot of questions on this. And I still think that people don't really get how important CPX is. And this is the first time where you're taking a workload and you're breaking it up. It's not an ASIC per se, but it's an approach -- it's an ASIC-like approach to a workload. So can you talk about CPX and whether there are more workloads, how ubiquitous that approach could be?
There is a need for breaking things down, whether it be the training or the inferencing. You're going to have many different types of inferencing requests and needs. So CPX takes you to a different stage within the same infrastructure to get that done. The concept is that you would have multiple different pieces of infrastructure working at the same time to accomplish that. This is everything that you would think about in the world of mixture of experts. This is the key piece right now, and it is very important for the model builders. All of it is based on how you designed your work with those experts.
And that takes a significant amount of compute that is able to complete not only a full model, but each one of the experts. So they're breaking that down, but not necessarily in a way that says you can use a different type of compute to do that. Staying on that full system is probably the most efficient way to get that done.
Let's talk about software for a moment. There's this argument some people make that, because you can now program with AI, AI will itself somehow break down your moat in CUDA, that it will allow somebody to build a platform faster that could approximate what CUDA does. How would you respond to that?
CUDA is a very long-standing and important development platform that has been with us for several generations, and we're probably on our 13th version of it. The important part of that is not only CUDA, the development platform, but the consistent libraries for all different types of industries, all different types of workloads. The way you want to think about those libraries [ marked ], you're usually generally providing them at least the first 100 lines of code or help that you can go and do.
So starting with that, an important group is going to be the enterprises. It is going to be in terms of those that are building for themselves and have some ability to do that. Not everyone is going to be staffed with software engineers at scale in order to do that software. And it's a very, very important part. For many years, people have talked of, hey, we can do something very similar to CUDA or we can take the key piece of it to CUDA.
It hasn't been very successful because when you think about AI and how fast it's moving, we are keeping updates all the time in terms of new techniques, new things that they are working on. And it just is always going to be behind if anybody really thinks that that's an easy thing to do. These have been designed working with all of our different GPUs, not just one GPU, but all of it is backwards compatible and forwards compatible. It's one of the best features that we have. You buy the compute and it will probably get stronger, more performance as we continue to improve the software over that period of time. We've done that with Hopper and you're also starting to see it with Blackwell. That helps that continuation on why they use our compute for a long period of time, keeps getting better as they own it.
So is there a metric that if you bought A100 and you're still using A100. With all the CUDA updates, how much have you been able to improve the performance of A100 with these CUDA updates?
So each one of them has different capabilities in terms of helping them each one of whether it be Ampere, whether it be Hopper 100, Hopper 200 [indiscernible] X factor improvement, even if you think right now with Blackwell. Blackwell right now, you've got a total increase from the last generation of 10 to 15x. And within that, you probably have a 2x just from the software right now after we've gone to market with it. It's a big increase improvement.
Great. Can we talk about margins? You've done a great job this year. You committed to being in the mid-70s. You made that commitment early this year and you've reached that. Some people are worried that because of the price of HBM and because of the HBM content and because of the cost escalations and just in your [indiscernible] that you won't be able to hold those margins. You sound pretty convinced that you can hold the mid-70s next year as Rubin ramps. So can you just talk about how you plan for this?
Yes. Always when you complete what you said you're going to do, they're always going to ask in terms of what's next. So we knew that would be very important to it. But we're very pleased with the work that the team did in terms of really fine-tuning both our cycle times, our yields, our costs, all of that to move to -- into the mid-70s. We think that's a very great number if you think about what we've accomplished in over a very short period of time. So what is with that is seeing right now that the Blackwell Ultra version was quite seamless, which, again, allowed us focusing more and more in terms of cycle time and work that we could do to do that.
We are aware in terms of supply, the prices in terms of supply. It's important. Those are very important parts of our business to do there. But if we think through just the scale of what we're doing and what we can do with just even one more day of efficiency of cycle time and our focus in terms of how to use our cost, the best way to do that manufacturing. We believe as we move into next year, that will also stay within about the mid-70s.
Great. One thing that I thought was really notable from this last earnings report was if you take your inventory increase, combined with your increase in purchase commitments, it had been going up a couple of billion dollars each quarter, and it went up $25 billion this time, a massive increase. So obviously, that portends significant revenue growth over the next 2 to 3 to 4 quarters. So can you talk about that? Can you talk about the purchase commitment side of it? Because people look at the inventory and they say, well, the inventory went up so much that's bad. I don't see why that's bad. That's good if you think you're going to grow so much and then you take the purchase commitments also and it went up a lot.
Yes. So let's focus on inventory and purchase commitments. You're right. Those are good things, but those are growing. That means we do have supply for what we think in terms of the future is of our demand that we have. We have ordered our supply. But let's break that down a little bit. The inventory is a place in terms of where we are building things that will be processed and likely go-to-market within the current quarter, okay? What you saw probably in the inventory and where we stand now at the beginning of December, probably all of that has moved in terms of has already been shipped to our customers.
So the next piece of that, though, is to therefore, look in terms of where we stand in terms of purchase commitments. If we have talked about in terms of our growth, our growth that we see in that $0.5 trillion by the end of next year, we have to be ordering very, very important amounts of supply. What has changed over time, keep in mind the complexity of our systems and what we have to do to put those together, whether those be components the 7 chips, there's a lot that needs to be ordered to make sure that we have.
Long lead time items are also key that we want to make sure that we are behind. So what's interesting about it is your purchase commitments, your inventory is important, but let's remember, supply and demand and managing that it's a day-to-day type of thing. If things change, you may need more, and you're always, always, always working the supply. But yes, you should look at that as -- it was a good thing. It was a good thing we're growing.
I wanted to ask about this famous slide that you showed at GTC now. And we all see this $500 billion number between calendar '25 and calendar '26, and we all try to back into what that means for calendar '26, but what that also doesn't say is that there are deals you're signing, if you can still do something within lead time, say, for the Anthropic partnership, for example, that would sit on top of that number. Is that correct?
That's correct. We had talked about that probably unique for us to actually at our GTC DC to discuss in terms of what we saw going into the next year, but it's important to understand the planning that all of these companies need to do from a capital capacity perspective as also compute. We felt it was important to understand there is a lot of growth still planned as we even move into the beginning of this next year as well. But you are correct. There is also an opportunity for that to increase more. I talked about in terms of the additional things that we saw in the Middle East in terms of focus and you may even hear about another one today as well.
Great. I mean that slide to me suggests you're going to do between $350 billion and $400 billion next year in revenue. So you're going to generate tons of cash. So the next question is capital allocation, and this is probably the last question we had time for. But how do you think about when you're making these strategic investments and you're allocating all this cash? I know you have to make these purchase commitments, so you have to keep a lot of cash on hand for that. But how do you think about capital management, given all of the cash that you're generating?
A really important question, and I want to make sure everybody understands one of our largest focuses is making sure that we have the cash available for our internal needs. And that is a lot in terms of the supply and the capacity that's going to be necessary to build what we're building. Today, as you know, the engineers are off working in terms of Vera Rubin and bringing that to market as soon as possible. But that means we need that capital just to run our business based on just the size of that growth. So that's always going to be, one, is back into the business in terms of what we need to do.
The second piece also a focus of ours is focusing in terms of shareholder return, okay? What can we do in terms of stock repurchases and our dividends. So those will always be a part of what we do within there, which then leads that last part within our free cash flow, what can we do in terms of strategic investments? Today, our strategic investments are focusing on the ecosystem and expanding that ecosystem. We have long-standing partners that we have been working with for years in terms of that. But the market is growing and there's opportunities within the ecosystem to assist and invest and learn from their work that they are doing because it will be an important part for AI going forward.
Now keep in mind, those investments that we invest in them are small. They still -- for a lot of their work that they need to purchase the capital and do that. It's probably still 90% or more in terms of what they need to do to raise that capital. But our goal is to help understand what's going to be possible in the future with those different types of ecosystem investments that we do.
Yes. It does seem like you're pivoting a little more toward ecosystem investment versus M&A. Is that fair?
I would say we do both. It's hard to think about very, very significant large types of M&A. I wish one would come available, but it's not going to be very easy to do so. So we do focus on M&A. We focus in terms of engineering teams that can be helpful in terms of our platform and work. So from time to time, we do have those.
Perfect. Well, we've run out of time. Thank you, Colette.
Thank you. Thank you so much.
NVIDIA — UBS Global Technology and AI Conference 2025
📣 Key message
- Core statement: NVIDIA sees no AI bubble, but rather several long-term transitions toward accelerated computing; GPUs and rack-scale co-design drive an estimated $3 to $4 trillion market (by 2030). The combination of hardware, system integration and CUDA software is meant to secure platform lock-in.
🎯 Strategic highlights
- Blackwell/Grace: Grace Blackwell configurations (200/Ultra/300) as a rack-scale approach with seven co-designed chips; goal: complete data center systems.
- Vera Rubin: Chip tape-out complete; ramp planned for the second half of 2027 (management: significantly higher performance than Ultra).
- Ecosystem: Partnerships (OpenAI LOI, Anthropic via CSPs) plus faster-growing inventory and purchase commitments to secure the supply chain.
🔎 New information
- Specifics: Vera Rubin is "taped out"; management expects market launch in H2 2027. The OpenAI LOI (10 GW) remains a framework agreement without a definitive closing; the $0.5 trillion plan figures presented do not include this LOI volume.
❓ Analyst questions
- Competition: Asked whether the lead is shrinking; management points to the full stack, the CUDA moat and backward/forward compatibility, and says the lead is not getting smaller.
- Inference ROI: Analysts asked about the profitability of inference; answer: token-driven demand and new reasoning/agentic workloads create a high-margin flywheel.
- Margins & supply: Questions on the mid-70s margin and the $25 billion increase in inventory and purchase commitments; management says optimizing cycle time, yield and procurement secures margins and prepares capacity.
⚡ Bottom line
- Conclusion: Clear focus on platform moat, system integration and software; Vera Rubin and continued Blackwell momentum are potential share-price drivers, but the OpenAI deal remains uncertain. High purchase commitments point to a strong revenue ramp to be realized over the coming quarters.
NVIDIA — Special Call - NVIDIA Corporation
1. Management Discussion
Good morning, everyone. I'm Mylene Mangalindan with NVIDIA Corporate Communications. Thank you for joining us to discuss the press release we issued today regarding a strategic partnership between NVIDIA and Synopsys to revolutionize engineering and design.
With me on the call today are Jensen Huang, Founder and CEO of NVIDIA, and Sassine Ghazi, President and CEO of Synopsys. [Operator Instructions]. As a reminder, this call is being recorded. The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without prior written consent. During this call, NVIDIA and Synopsys may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties. For a discussion of factors that could affect their business, please refer to the disclosure in NVIDIA and Synopsys' most recent Forms 10-K and 10-Q and the reports that they may file on Form 8-K with the Securities and Exchange Commission.
With that, let me turn the call over to Sassine.
Good morning, everyone. Jensen, it's great to be here with you on this special day as we're announcing the expansion of our relationship. And I know our relationship with NVIDIA has lasted for decades.
Since the founding of our company.
Since the founding of NVIDIA. And what I'm most excited about here today is truly revolutionizing how engineering is done across multiple industries. At its core, what we're announcing today is bringing together Synopsys' engineering software and domain expertise with NVIDIA's accelerated computing and AI technology to transform how engineering is done. I often refer to it as how to reengineer engineering in this era of pervasive intelligence. NVIDIA accelerated computing sparked an AI revolution. Today, most of our experience with AI is through software on screen.
As AI expands into the physical world, or physical AI, the engineering complexity of designing such systems is massive because you are dealing with multiple engineering domains that need to come together at the system level in order to make sure it's right the first time. We're talking about electronics, electrical, mechanical structure, thermal, connecting to a bunch of sensors, reading the physical world and being able to prototype, design, simulate and make sure you're doing it in a cost-effective way, on time. So dealing with that complexity of system design will not happen, and it cannot happen in a practical way, without accelerating at all levels of the stack: at the computation level through GPU acceleration, where we'll be able to achieve factors in terms of acceleration, as well as any AI capability to change the workflow, all the way up to system-level modeling with the digital twin and what is possible to create and prototype these systems virtually before you create a physical prototype.
So combining what NVIDIA is bringing with what Synopsys offers and our leadership position in EDA and IP, which has been essential to tame the complexity over the last number of decades at the semiconductor level to the system-level simulation and analysis. So the solution we're bringing as silicon to system engineering solution with an accelerated compute and the NVIDIA stack is something that we're very excited about. The other part that's often understated is the go-to-market and customer reach. With the ANSYS acquisition, we broadened our customer base to engineering teams across nearly every industry.
With that combined with the technology and that global network of thousands of direct sellers and channel partners to drive adoption, the opportunity is significant. To sum it up, we will integrate the strength of Synopsys' unmatched engineering solutions with NVIDIA's leadership in accelerated computing and AI to help R&D teams design, simulate and verify these intelligent products with greater precision and speed, at lower cost. Together, we aim to unlock new market opportunities.
Let's roll a quick video and then over to Jensen.
[Presentation]
That's incredible, pretty amazing to start, the incredible work that we do together. Thank you, Sassine, it's great to be with you and to announce our partnership. We're at a major inflection point in computing for design and engineering. This is one of the most compute-intensive industries in the world. In fact, EDA was the killer app of the workstation industry. It drove the generation of computing before us. And for the last 40 years, it has supported an enormous industry of general-purpose computing and CPUs. As you know, NVIDIA pioneered CUDA accelerated and AI computing that is now revolutionizing every industry, and we're excited to announce that we're accelerating that for EDA, system design automation, computer-aided engineering, many of the things that you saw just a second ago, and of course, computer-aided drug discovery, the next frontier.
Our multiyear partnership spans NVIDIA CUDA acceleration, agentic and physical AI and Omniverse digital twins. These are all the things that I've been working on for coming up on a decade. And finally, we've reached a level of maturity and capability that we are able to revolutionize the entire design and engineering industry. On this slide, the order of magnitude speed-up has unlocked the opportunity to do physically accurate digital twin simulations at a scale never before possible. The speed-up of NVIDIA's GPU accelerated computing has made it possible over the last decade to shift the way that scientific computing is done. Take the mix of CPU to GPU computing in a supercomputing data center for scientific simulation and physical simulation, which is also the foundation of the work that's being done here in the EDA, SDA and CAE industries.
In 2016, it was 90% CPUs and 10% GPU accelerated. This year, that entire mix has flipped. Over the course of the last decade, we've now shifted it to 90% accelerated computing and 10% general-purpose computing. The same shift is going to happen in this industry. The order of magnitude speed-up is going to unlock opportunities that have never been possible before. The performance gains are remarkable. You can just take a look at some of the sampling here, from computational lithography to logic simulation, circuit simulation, fluid dynamics and AI physics, using AI to emulate first-principle physics simulations. The performance gains are remarkable. It's across the core engineering workloads that I just mentioned, and we're seeing speed-ups from 10x to over 1,000x.
Basically, what this means is something that would take weeks could now happen in hours. But it's also very important that we can now scale the simulation from off-line to real time or just from a part of a system to the entire system or from just a little tiny part of a simulation maybe to an entire factory or a city. And we're extending this acceleration next year to play in real, structural mechanics, electromagnetics and thermal simulation. Basically, as we think about where we are in the journey of doing product design from designing the silicon to the systems to the system of systems, we're in the future going to do all of this inside the computer.
With accelerated computing, we can now essentially have a digital twin of the final product that we want to build, all living inside the computer so that we could explore the design space and make the product as perfect as we can before we even make the first version of its physical embodiment. All of this runs everywhere across major cloud providers and OEM systems. This is one of the unique properties of NVIDIA's installed base. We're in every single cloud. We're in every single computer company. We're available on-prem, at the edge or in supercomputing centers. Just about everywhere there is computing, we can now run NVIDIA. As a result, just because of our installed base and our reach, Synopsys can also run everywhere. Engineering teams of any size anywhere in the world can now have the benefit of NVIDIA's accelerated computing and AI computing from day 1.
Our partnership will open new market opportunities for both of our companies. As I mentioned earlier, this is one of the largest compute-intensive industries in the world. It has not been serviced, not been addressed, by accelerated computing until now, and we're accelerating that process with our deep partnership. Together, we will address nearly every industry where scientists are inventing new technologies and engineers are creating new products and the factories that make them. Synopsys is expanding our opportunity from chips to nearly every industry. NVIDIA's computing has a new domain of applications to accelerate. And so we're super excited about that. Sassine, this is a great partnership. We've been partners for, well, all 33 years of NVIDIA's life and have many stories to tell when we have time for everybody.
But this is an exciting moment for both of our companies. We're revolutionizing the entire product design and engineering world. It's a huge expansion of market opportunities for Synopsys. It's a huge expansion opportunity for NVIDIA. For both of us, this is a non-exclusive relationship, and the reason for that is because, of course, everything that we do here is incredibly exciting for both of our companies, but it's really going to revolutionize the entire space, and it's going to be a giant growth opportunity for all of us and all of our partners.
Exactly.
Should we take some questions?
Thank you. That concludes our prepared remarks. We'll now open the call for questions. Your first question is from Tae Kim. [Operator Instructions].
2. Question Answer
Good to see you guys again. So coding is the first vertical, I think, where we're seeing tangible, huge speed-ups and this mass productivity increase. Cursor talks about this University of Chicago study where all of our clients are seeing 40% gains in productivity. I saw that you name-checked auto, industrial and aerospace R&D. When can we expect to see those kinds of big step-up productivity gains using AI and GPUs in the R&D of these other verticals?
NVIDIA is a huge customer of Cursor. 100% of NVIDIA software engineers, chip design engineers, every single engineer now is augmented by AI. Now of course, Cursor is generative AI with text. The work that we're doing here has to obey the laws of physics. Text is hard, of course. Teaching an AI how to program a computer, basically how to communicate with a computer and tell it what to do, is one thing, but creating software that accelerates physics and have it be physically based and physically accurate, so that we could design products spanning from chips to systems, the system of systems, all the way out to factories and robotics, that requires a whole new level of computation.
And so this is really about the intersection between computing and the physical world. This is, if you will, much more akin to scientific computing, physics simulation, robotics. And that field of AI is quite new. And we've been working on this for coming up on a decade. And so the combination of the work that you've seen me do over the last decade, from CUDA and all of the software stacks associated with that, for example, cuLitho and cuDSS and cuFFT and cuBLAS and all of these types of libraries that sit on top of CUDA, to NVIDIA's physical AI pioneering work, to Omniverse, our digital twin platform: all of these libraries are now going to be integrated into Synopsys through our partnership. And what's really exciting is that Sassine has pivoted and marshaled resources across the entirety of Synopsys to go after this opportunity, and we're so excited and so pleased by the partnership.
We're going to make a $2 billion investment in Synopsys. But this -- overall, this is going to be a gigantic growth opportunity for the industry.
Yes. And maybe to add a little bit more color. We started redesigning some of our products about 7 years ago on NVIDIA GPUs using the CUDA layer, and in a number of cases, we've seen a significant speed-up. When you talk about a 10x, 15x, 20x speed-up on work that may take 2 or 3 weeks, so that you can bring it down to hours, customers will adopt it, because the bottleneck of designing these complex chips or complex systems is your ability to verify and the computation required to verify these systems. The worst thing you can do is assume that you're ready to go and launch a product and it doesn't work as intended, because that costs hundreds of millions of dollars.
So you spend a lot of energy in the design and simulation phase. We have a number of products already in use at customers. It is still very early stages in terms of broad adoption, but that's why we refer to an expanded opportunity for both companies.
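To put rough numbers on those speed-ups, here is a minimal Python sketch of the arithmetic. It assumes a job that runs continuously, takes the three-week baseline and the 10x to 20x factors from the remarks above, and adds the 1,000x upper end quoted in the prepared remarks. The function name `accelerated_hours` is purely illustrative and not part of any NVIDIA or Synopsys tool.

```python
HOURS_PER_WEEK = 7 * 24  # wall-clock hours, assuming the job runs around the clock


def accelerated_hours(baseline_weeks: float, speedup: float) -> float:
    """Return the wall-clock hours left after dividing a baseline run by `speedup`."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_weeks * HOURS_PER_WEEK / speedup


if __name__ == "__main__":
    # Factors mentioned on the call; the baseline is the upper end of "2, 3 weeks".
    for factor in (10, 15, 20, 1000):
        hours = accelerated_hours(3, factor)
        print(f"3 weeks at {factor}x -> {hours:.1f} hours")
```

Under these assumptions, a three-week run drops to roughly a day at 20x, and only the upper-end factors bring it below an hour.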
Thank you. Your next question comes from [indiscernible]. Ian, feel free to ask your question.
Jensen, you're often talking about shrinking that go-to-market time across industries for other companies using EDA multiphysics -- sorry, to compute. But in this partnership, is there something materially new with this enhanced collaboration that changes how fast NVIDIA itself can bring silicon and platforms to market beyond what you already do today? And for Sassine, is Jensen's answer going to be universal?
First of all, GPU adoption in the world of engineering is quite low. The world today has hundreds of millions of CPU cores or tens of millions of CPU systems, general purpose computing systems running EDA tools. We do that here at NVIDIA. And so in fact, our first supercomputer was not for AI; it was for running EDA so that we could design our chips perfectly, so that we could have the speed of innovation, so that we can waste as little money as possible when we have to do -- if you have to do redesigns. The best way to do things cost effectively is to do it perfectly the first time.
And so this industry has been growing with Moore's Law for 40 years. Finally, as you know, Moore's Law has really reached its limit and we need to give it a new computing -- new way of doing computing and this is where NVIDIA comes in. What's really exciting about this partnership is that it's broad and deep between us and Synopsys between using CUDA to accelerate the software, physical AI to emulate with AI to expand its speed and scale and also connected into digital twins on Omniverse. It's broad and it's deep in scale, and we're going to accelerate the time to market of -- we have pretty significant teams assigned to each other to accelerate all of these software tools to create these new products that Synopsys can take to market.
The time is really now, and I think that the expansion of the market opportunity goes from Synopsys and the EDA industry, addressing a several hundred billion dollar chip industry to now addressing a multitrillion dollar every product industry. And so in the future, every product will be designed in digital twins.
And to answer your second part of the question, any company with engineering R&D that is designing the next system, the next sophisticated intelligence system, you need the software stack that we deliver. And in order for them to make it effectively deliver it on time, cost, et cetera, and tame that complexity, they will be a target customer that they will welcome that speed up and ability to design those systems. So it's not unique only to NVIDIA what we were building. But every company that is building either silicon or system will welcome that speed up and the solution we're collaborating on.
Yes, this partnership essentially enables this industry to address the entire R&D budget of the whole world's GDP. And that's a pretty big deal. Everything that's going to get designed and built will be done first in a digital twin. I said earlier that in 2016, 90% of the world's scientific supercomputers running physical simulations and biological simulations and such was 90% general purpose on CPUs.
Today, it has flipped completely. CPU-only general purpose computing in supercomputers is only 10%. NVIDIA accelerated computing is now 90% of the world's physical science simulation computers. This is going to happen also to the EDA industry. In addition to that, of course, there is the TAM expansion and market opportunity because of the work that we're doing.
Your next question comes from [ Stephen Mellis ].
I have one complicated question and one simple question. The complicated question is, in many of these physical simulations and things like engineering of critical components in aircraft and whatnot, there does still need to be a full double precision sort of simulation done at some point. And so how does this address that bottleneck of still having to do that at some point, even if you can do more iterations on the design first, but you still have to verify at the very end at double precision?
So that one's for Jensen: how much of a bottleneck is that still? And then the simple one for Sassine is how much of this $2 billion is going to go toward purchase of GPUs or GPU cloud computing services to get your software ready to do all this?
Simulation, and the evolution of simulation to co-simulation with emulation, is multi-resolution. It's no longer just FP64. Of course, all of our chips support FP64. And we support FP64, we support FP32, FP16, and all of the Tensor processing configurations that sit at the intersection of all that. And so NVIDIA architecture is incredibly good at this. This is a fundamental difference between what NVIDIA makes and what an ASIC is. We can address the world of simulation exactly as you're pointing out. We can address the world of simulation and we can address the world of AI emulation and everything in between, co-simulation in between.
And so for industries that are fundamentally physically based and have mission-critical applications, this capability is really important. And so we address the application space completely. The challenge, of course, is to reformulate the algorithms, the simulation algorithms, so that they can be accelerated on CUDA. And that is a multiyear journey. It took, Sassine, some 7 years probably to do cuLitho. cuDSS took several years to do. cuFFT took several years to do, and now integrating it and reformulating Synopsys' applications to take advantage of this acceleration is what this is all about. And so we -- we have some 20 applications now that are CUDA accelerated.
And all of it will be CUDA accelerated and also AI physics infused and accelerated over time. And so we've got a lot of work to do, but that's what this partnership is really about: focusing the two engineering teams and pivoting resources across the entire companies so that we could take this capability to market as soon as possible.
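The multi-resolution point can be illustrated without a GPU. The sketch below is a toy under stated assumptions, not NVIDIA's or Synopsys' method: it uses Python's standard `struct` module to round every intermediate result to IEEE 754 half, single or double precision, and adds 0.1 ten thousand times at each precision. The helper names `round_to` and `accumulate` are hypothetical.

```python
import struct

# struct format codes for IEEE 754 binary16, binary32 and binary64
FORMATS = {"fp16": "e", "fp32": "f", "fp64": "d"}


def round_to(value: float, precision: str) -> float:
    """Round a Python float to the nearest value representable in `precision`."""
    code = FORMATS[precision]
    return struct.unpack(code, struct.pack(code, value))[0]


def accumulate(step: float, count: int, precision: str) -> float:
    """Add `step` to a running total `count` times, rounding after every addition."""
    step = round_to(step, precision)
    total = 0.0
    for _ in range(count):
        total = round_to(total + step, precision)
    return total


if __name__ == "__main__":
    exact = 1000.0  # 0.1 * 10_000 in exact arithmetic
    for name in FORMATS:
        result = accumulate(0.1, 10_000, name)
        print(f"{name}: total={result!r} abs_error={abs(result - exact):.3g}")
```

In this toy, the lower-precision totals drift further from 1000 than the double-precision one, which hints at why a final FP64 pass can remain part of a workflow even when cheaper formats carry most of the iterations.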
Maybe to add on Jensen's point here before I answer your easy question. The -- this is not replacing accurate simulation because you're doing something at a higher level or virtualized with a digital twin. You still need both but accelerating something that is taking weeks or it's not even practical to do and make it happen through this acceleration is where our customers are looking for that opportunity.
Now for the second part of your question, the $2 billion investment will provide Synopsys optionality. As you know, we have a very strong balance sheet. We are already a customer of NVIDIA in our data center, but there is no intention or commitment to use that $2 billion to purchase NVIDIA GPUs. This is something that we do in the normal course of business. We've been doing it for many years now.
Sassine and Synopsys are making such a large commitment in this partnership. We thought we would also make a large commitment in this partnership. And there's no purchasing relationship between the investment and anything else. Synopsys is already a customer of NVIDIA's. And in the future, of course, as we move into the world of accelerated computing and AI computing, it will be a much larger customer of NVIDIA, but there's no relationship between the two sides of that.
Exactly.
Your next question comes from Nitin Dahad.
Okay. Nitin Dahad with EE Times. Just a follow-on from that last question. I didn't really understand, or maybe, what's the reason for an investment rather than just a straight partnership when both companies are investing resources and continuing to invest? So where is that going? Is it more engineering resources? I think you just said it's not extra GPUs. It's normal course of business. And then the second part of it is, is there any time line? You talk about a multiyear partnership. You talk about bringing certain products to market. What are the key things that are coming out immediately from this partnership in terms of products that customers can use? I have a lot of other questions, but I'll do that later with you separately.
The investment is a demonstration of commitment and appreciation for Synopsys going all in on the NVIDIA platform. And not to mention, I think it's a great investment. We're revolutionizing EDA, CAE, computer-aided drug discovery, basically all aspects of R&D, R&D for anybody who does product design and product innovation and product manufacturing. And so this is such a large expansion of the market opportunity. Along with the partnership, they're making such a large commitment on building on NVIDIA. This was a wonderful way for us to show our commitment in the partnership. But I recognize that none of this is exclusive.
Synopsys has a lot of chip partnerships that they're going to continue to nurture and continue to advance. NVIDIA has a lot of partnerships with Cadence and Siemens and Dassault that we're going to continue to nurture and advance. Each one of these partnerships is different, and this just felt natural to us. And we're delighted to do it, and I think it's going to be -- it's going to be a great investment for us.
So Nitin, on the why for Synopsys -- so you heard from Jensen why NVIDIA made the investment. From a Synopsys point of view, why we took the investment is really about optionality and acceleration. We can do the work we're doing on our own. We've been doing it for 7 years on our own. So it's not like we're looking for a motivation to do it. It's going to become table stakes. We know the market is going there. Can we run faster? Can we deliver faster on the road map? You asked as well by when this technology will come about. We already have a number of products that we have demonstrated and that are in use with customers that are demonstrating that speedup.
But we're in early, early days. There are so many opportunities to truly change the way simulation is done and to address the various bottlenecks of design. How do you accelerate it not only with the GPU CUDA layer, but with the AI workflow, which will change with agentic AI, not only the generative AI that we already have in the portfolio, and how will it pivot to change the way engineering is done? And at the system level, how do you virtualize the system to reduce the cost and improve the speed to market for our customers? So for us, it's all about acceleration and grabbing the opportunity that we strongly believe is going to happen.
When you look at all of the work that we've done to prepare our go-to-market at this point, 20-some-odd applications have been accelerated. But where are they going to run? And notice, we showed you two systems, two incredible systems that are going to accelerate all of these tools. One of them is Blackwell. The other is Blackwell RTX Pro. One is optimized for the highest possible speed in all the simulation and all the AI physics emulation, and one is designed to do all of that in addition to Omniverse. And these two architectures are now available from all of the world's OEMs, and these two architectures are available in all the world's clouds.
And now we have the ability to accelerate applications for anybody, wherever they like to be able to do it. So everything from the partnership, the deployment of resources across all these different domains of applications and tools, to the go-to-market, preparing the OEMs, preparing the clouds, all of that is now ready. The inflection -- this is a very big moment for the industry, and now the race is on to move the world from general-purpose computing to adding accelerated computing on top of it.
Your next question comes from Matt Hamblen.
Hi, everybody. Thank you very much. I still don't understand what multiyear means. I mean three years makes sense, but 10 years seems like unrealistic.
So we showed a road map in which, by 2026, we are targeting a number of areas that today I call bottlenecks in the design. What does that mean? Areas where you have a choke point driven by the time it takes to do a task. So those are areas where we have committed R&D teams and prioritization to accelerating our products and workloads for NVIDIA GPUs by 2026.
I think that makes a lot of sense. I mean, basically, the race is on now. And NVIDIA has partnered with design tool companies across the industry for some time. But really, this is kind of, if you will, the inflection point that everybody now has to race towards. Over the next couple of years, I think we're going to see the industry shift from just general-purpose computing to accelerated computing. And the ability to scale up simulations by an order of magnitude, to be able to scale up the speed of simulation by orders of magnitude, that day has come. I think every single engineering organization over the next couple of years is going to enjoy the benefits of the work that we're doing here and the platform shift that's happening.
I don't think it's going to take 10 years. I said earlier that scientific computing, which actually moves relatively slowly, in the course of the last 10 years went from 90% general-purpose computing and 10% accelerated computing to now 90% accelerated computing. And so that shift took 10 years. This is the industrial space. For people here, their livelihood depends on it. These tools are mission-critical. Time to market is mission-critical. Competitiveness is mission-critical. I think we're going to see a platform shift in this really quite gigantic computing industry over the next 2 to 3 years. And so almost every engineering organization will consider acceleration -- accelerated computing starting tomorrow morning.
Yes. The most difficult part is doing the work, meaning designing to get to the acceleration. Once the acceleration is there, getting the customer to adopt is something I'm less worried about because the bottleneck, the need for that speedup, is there. Customers are often looking for new methods to get the work done faster and still with a high level of accuracy. The 2026 reference I gave is the commitment from our Synopsys R&D to prioritize these key technologies to be accelerated on NVIDIA GPUs, as well as demonstrating that acceleration with customers, and then the adoption will happen.
Your next question comes from [ Christina Personalopolis ].
Sorry about that. Just two questions. The first question just has to do with regulatory concerns. So Jensen, you're investing in Intel, Anthropic, the list continues [indiscernible]. Are you concerned that a $2 billion investment in Synopsys would start to raise some eyebrows? And then the second question is, let's say I'm an AMD engineer. Does this mean that your tools will now be optimized for NVIDIA, making it more difficult for competitors to utilize them?
The reason why we're investing in our ecosystem is because we're going through a platform shift from general-purpose computing to accelerated computing and AI computing. And so it's sensible, when we're building, and you know that our platform consists of CUDA and all the CUDA-X libraries and Omniverse and AI, both agentic as well as physical AI, those libraries, that software. NVIDIA is, in a lot of ways, a software company that builds great chips. Everybody thinks of us as a chip company, but in fact, what really gets integrated into companies like Synopsys are all of the libraries that we designed and created. And so when we are able to invest in key parts of the ecosystem, we accelerate the entire ecosystem.
And so the investment makes perfect sense for us. The partnership is nonexclusive. There are no obligations whatsoever for Synopsys to only buy NVIDIA. And they're welcome to continue to work with their rich ecosystem of chip partners, and we're going to continue to work with our ecosystem of really important EDA and SDA and CAE industry partners like Cadence and Siemens and Dassault. And so with respect to the tools that we all buy, as you know, NVIDIA uses a lot of x86. We partner with Intel, we partner with AMD, and we buy lots and lots of CPUs. All of NVIDIA's EDA, the way we do chip design, the way we do system engineering today, is still largely based on x86 CPUs. This is really the beginning of that platform shift so that in the future, that's going to be augmented and accelerated by NVIDIA GPUs. And I'd be delighted for all of the chip industry to buy NVIDIA GPUs for designing their chips, just as I buy their chips to design our chips.
Yes. And Christina, to be clear, today, pretty much Synopsys' entire portfolio is x86-based. When a number of customers started using, say, Arm, we ported our software to Arm-based architecture. For hyperscalers investing in their own compute, we port our software to deliver on their own compute. So we're -- we follow the customer requirements and needs. What's unique about the partnership here with NVIDIA is that we have a partner that is investing in the CUDA layer, not only the architecture of the compute, and is aiming at the market of engineering computation and speeding up the solution, because that requires an investment from both sides.
This is not about making our software available on an AMD or an Intel architecture or Arm, et cetera, because it's already available. It's about whether you can take the partnership to the next level of acceleration and value to the customers. And that requires investments from both sides. So if AMD or Intel or whichever customer wants to capture a similar opportunity, it's not exclusive. We're willing and happy to work with them. So that's what's unique about what we're talking about here today.
That concludes our Q&A.
Well, one of the things that -- one of the things I will say is that of all of the AI opportunities, industrial AI, physical AI, is the largest of all. And the reason for that is very clear. The world's industries represent the vast majority of a $100 trillion industry. Today, that industry, whether you're designing cars or trains or planes or designing computers, is largely based on general-purpose computing. And we know that that journey, which has taken us the last 40 years, has been incredible. And Moore's Law has enabled us to reach wherever we -- the incredible condition that we are in today. But in order for us to go even further, in order for us to do even more, expanding the reach of design and engineering so that we could do almost everything in the world inside a digital environment long before we create the physical manifestation, that journey we've been preparing for several years now.
And today, our announcement really kicks it into turbocharge. And so this is a huge opportunity for NVIDIA and a huge opportunity for Synopsys. I'm grateful for our partnership over the 33 years, frankly, since the very first day of our company. On the first day of our company, Synopsys enabled NVIDIA to design our chips. Now our partnership is going to enable everyone to design everything that's physically manifested in the future. And so thank you for your partnership. I'm very excited about this, and I'm looking forward to incredible returns on my investment.
You get it. Thank you, Jensen. Thank you.
Thank you, Sassine.
NVIDIA — Special Call - NVIDIA Corporation
🎯 Core message
- Deal: NVIDIA and Synopsys announce a multiyear strategic partnership, together with an announced NVIDIA investment of USD 2 billion, to accelerate engineering workflows.
- Goal: GPU acceleration, AI integration and Omniverse digital twins are intended to massively speed up the design, simulation and verification of products from silicon to systems.
⚡ Strategic highlights
- Technology: Integration of NVIDIA CUDA (NVIDIA's GPU software platform), physical AI (physics AI) and Omniverse (digital twin platform) into Synopsys tools.
- Market: Expansion from classic electronic design automation (EDA) to system-level simulation in industries such as automotive, aerospace, industrial and pharma.
- Go-to-market: Synopsys brings a broad customer base and sales channels; NVIDIA provides compute and cloud availability through OEMs and hyperscalers.
🔭 New information
- Products: About 20 applications are already CUDA accelerated; a target roadmap with prioritized areas through 2026 was cited.
- Purpose of the investment: NVIDIA's payment is meant to give Synopsys optionality and to speed up its R&D prioritization; there is no direct obligation to buy GPUs.
❓ Analyst questions
- Adoption timeline: Management speaks of "multiple years" with visible results by 2026, but expects greater momentum within 2 to 3 years.
- Accuracy vs. speed: Discussion of double-precision verification (FP64). NVIDIA stresses a multi-resolution approach: FP64 remains available, while accelerated workflows serve pre-verification and iteration.
- Competition & regulation: The partnership is not exclusive; Synopsys remains multi-platform. The investment could raise questions, but the companies see themselves as holding an open ecosystem position.
⚡ Bottom line
- Relevance: The announcement signals a strategic platform shift: NVIDIA expands its addressable TAM (total addressable market) beyond chips, and Synopsys scales from EDA to system-wide engineering workflows. For shareholders, this is a long-term growth play on tool and infrastructure adoption, but with delivery and integration risks over several years.
NVIDIA — Q3 2026 Earnings Call
1. Management Discussion
Good afternoon. My name is Sarah, and I will be your conference operator today. At this time, I would like to welcome everyone to NVIDIA's Third Quarter Earnings Call. [Operator Instructions].
Toshiya Hari, you may begin your conference.
2. Question Answer
Thank you. Good afternoon, everyone, and welcome to NVIDIA's conference call for the third quarter of fiscal 2026. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer; and Colette Kress, Executive Vice President and Chief Financial Officer.
I'd like to remind you that our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the fourth quarter of fiscal 2026.
The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without our prior written consent. During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties, and our actual results may differ materially.
For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent Forms 10-K and 10-Q and the reports that we may file on Form 8-K with the Securities and Exchange Commission.
All our statements are made as of today, November 19, 2025, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements. During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP measures to GAAP financial measures on our website. With that, let me turn the call over to Colette.
Thank you, Toshiya. We delivered another outstanding quarter with revenue of $57 billion, up 62% year-over-year and a record sequential revenue growth of $10 billion or 22%. Our customers continue to lean into 3 platform shifts, fueling exponential growth for accelerated computing, powerful AI models and agentic applications. Yet we are still in the early innings of these transitions that will impact our work across every industry. We currently have visibility to $0.5 trillion in Blackwell and Rubin revenue from the start of this year through the end of calendar year 2026. By executing our annual product cadence and extending our performance leadership through full stack design, we believe NVIDIA will be the superior choice for the $3 trillion to $4 trillion in annual AI infrastructure build we estimate by the end of the decade. Demand for AI infrastructure continues to exceed our expectations. The clouds are sold out and our GPU installed base, both new and previous generations, including Blackwell, Hopper and Ampere, is fully utilized. Record Q3 data center revenue of $51 billion increased 66% year-over-year, a significant feat at our scale. Compute grew 56% year-over-year, driven primarily by the GB300 ramp, while networking more than doubled, given the onset of NVLink scale-up and robust double-digit growth across Spectrum-X Ethernet and Quantum-X InfiniBand.
The world's hyperscalers, a trillion-dollar industry, are transforming search, recommendations and content understanding from classical machine learning to generative AI. NVIDIA CUDA excels at both and is the ideal platform for this transition, driving infrastructure investment measured in hundreds of billions of dollars.
At Meta, AI recommendation systems are delivering higher quality and more relevant content, leading to more time spent on apps such as Facebook and Threads. Analyst expectations for the top CSPs' and hyperscalers' 2026 aggregate CapEx have continued to increase and now sit roughly at $600 billion, more than $200 billion higher relative to the start of the year.
We see the transition to accelerated computing and generative AI across current hyperscaler workloads contributing toward roughly half of our long-term opportunity. Another growth pillar is the ongoing increase in compute spend driven by foundation model builders such as Anthropic, Mistral, OpenAI, Reflection, Safe Superintelligence, Thinking Machines Lab and xAI, all scaling compute aggressively to scale intelligence.
The 3 scaling laws, pre-training, post-training and inference, remain intact. In fact, we see a positive virtuous cycle emerging whereby the 3 scaling laws and access to compute are generating better intelligence and, in turn, increasing adoption and profits.
OpenAI recently shared that their weekly user base has grown to 800 million, that enterprise customers have increased to 1 million and that their gross margins were healthy, while Anthropic recently reported that its annualized run rate revenue has reached $7 billion as of last month, up from $1 billion at the start of the year.
We are also witnessing a proliferation of agentic AI across various industries and tasks. Companies such as Cursor, Anthropic, OpenEvidence, Epic and Abridge are experiencing a surge in user growth as they supercharge the existing workforce, delivering unquestionable ROI for coders and health care professionals.
The world's most important enterprise software platforms like ServiceNow, CrowdStrike and SAP are integrating NVIDIA's accelerated computing and AI stack. Our new partner, Palantir, is supercharging the incredibly popular Ontology platform with NVIDIA CUDA-X libraries and AI models for the first time.
Previously, like most enterprise software platforms, Ontology ran only on CPUs. Lowe's is leveraging the platform to build supply chain agility, reducing costs and improving customer satisfaction. Enterprises broadly are leveraging AI to boost productivity, increase efficiency and reduce cost. RBC is leveraging agentic AI to drive significant analyst productivity, slashing report generation time from hours to minutes. AI and digital twins are helping Unilever accelerate content creation by 2x and cut costs by 50%.
And Salesforce's engineering team has seen at least a 30% productivity increase in new code development after adopting Cursor. This past quarter, we announced AI factory and infrastructure projects amounting to an aggregate of 5 million GPUs. This demand spans every market, sovereigns, model builders, enterprises and supercomputing centers, and includes multiple landmark build-outs: xAI's Colossus 2, the world's first gigawatt-scale data center, and Lilly's AI factory for drug discovery, the pharmaceutical industry's most powerful data center.
And just today, AWS and HUMAIN expanded their partnership, including the deployment of up to 150,000 AI accelerators, including our GB300. xAI and HUMAIN also announced a partnership in which the 2 will jointly develop a network of world-class GPU data centers anchored by a flagship 500-megawatt facility.
Blackwell gained further momentum in Q3 as GB300 crossed over GB200 and contributed roughly 2/3 of the total Blackwell revenue. The transition to GB300 has been seamless, with production shipments to the major cloud service providers, hyperscalers and GPU clouds, and is already driving their growth. The Hopper platform, in its 13th quarter since inception, recorded approximately $2 billion in revenue in Q3. H20 sales were approximately [ $50 million ]. Sizable purchase orders never materialized in the quarter due to geopolitical issues and the increasingly competitive market in China. While we were disappointed in the current state that prevents us from shipping more competitive data center compute products to China, we are committed to continued engagement with the U.S. and Chinese governments and will continue to advocate for America's ability to compete around the world.
To establish a sustainable leadership position in AI computing, America must win the support of every developer and be the platform of choice for every commercial business, including those in China. The Rubin platform is on track to ramp in the second half of 2026. Powered by 7 chips, the Vera Rubin platform will once again deliver an X-factor improvement in performance relative to Blackwell. We have received silicon back from our supply chain partners and are happy to report that NVIDIA teams across the world are executing the bring-up beautifully.
Rubin is our third-generation rack-scale system. We substantially redefined the manufacturability while remaining compatible with Grace Blackwell. Our supply chain, data center ecosystem and cloud partners have now mastered the build-to-installation process of NVIDIA's rack architecture. Our ecosystem will be ready for a fast Rubin ramp.
Our annual X-factor performance leap increases performance per dollar while driving down computing costs for our customers. The long useful life of NVIDIA's CUDA GPUs is a significant TCO advantage over accelerators. CUDA's compatibility and our massive installed base extend the life of NVIDIA systems well beyond their original estimated useful life. For more than 2 decades, we have optimized the CUDA ecosystem, improving existing workloads, accelerating new ones and increasing throughput with every software release.
Most accelerators without CUDA and NVIDIA's time-tested and versatile architecture become obsolete within a few years as model technologies evolve. Thanks to CUDA, the A100 GPUs we shipped 6 years ago are still running at full utilization today, powered by a vastly improved software stack.
We have evolved over the past 25 years from a gaming GPU company to now an AI data center infrastructure company. Our ability to innovate across the CPU, the GPU, networking and software and ultimately drive down cost per token is unmatched across the industry. Our networking business, purpose-built for AI and now the largest in the world, generated revenue of $8.2 billion, up 162% year-over-year, with NVLink, InfiniBand and Spectrum-X Ethernet all contributing to growth.
We are winning in data center networking, as the majority of AI deployments now include our switches, with Ethernet GPU attach rates roughly on par with InfiniBand. Meta, Microsoft, Oracle and xAI are building gigawatt AI factories with Spectrum-X Ethernet switches, and each will run its operating system of choice, highlighting the flexibility and openness of our platform.
We recently introduced Spectrum-XGS, a scale-across technology that enables gigascale AI factories. NVIDIA is the only company with AI scale-up, scale-out and scale-across platforms, reinforcing our unique position in the market as the AI infrastructure provider.
Customer interest in NVLink Fusion continues to grow. We announced a strategic collaboration with Fujitsu in October, where we will integrate Fujitsu's CPUs and NVIDIA GPUs via NVLink Fusion, connecting our large ecosystems. We also announced a collaboration with Intel to develop multiple generations of custom data center and PC products, connecting NVIDIA and Intel's ecosystems using NVLink.
This week at Supercomputing 25, Arm announced that it will be integrating NVLink IP for customers to build CPU SoCs that connect with NVIDIA. Currently on its fifth generation, NVLink is the only proven scale-up technology available on the market today.
In the latest MLPerf training results, Blackwell Ultra delivered 5x faster time to train than Hopper. NVIDIA swept every benchmark. Notably, NVIDIA is the only training platform to leverage FP4 while meeting MLPerf's strict accuracy standards. In SemiAnalysis' InferenceMAX benchmark, Blackwell achieved the highest performance and lowest total cost of ownership across every model and use case. Particularly important is Blackwell's NVLink performance on mixture of experts, the architecture for the world's most popular reasoning models.
On DeepSeek-R1, Blackwell delivered 10x higher performance per watt and 10x lower cost per token versus H200, a huge generational leap fueled by our extreme co-design approach. NVIDIA Dynamo, an open source, low-latency modular inference framework, has now been adopted by every major cloud service provider. Leveraging Dynamo's enablement of disaggregated inference and the resulting increase in performance of complex AI models such as MoE models, AWS, Google Cloud, Microsoft Azure and OCI have boosted AI inference performance for enterprise cloud customers.
We are working on a strategic partnership with OpenAI focused on helping them build and deploy at least 10 gigawatts of AI data centers. In addition, we have the opportunity to invest in the company. We serve OpenAI through their cloud partners, Microsoft Azure, OCI and CoreWeave. We will continue to do so for the foreseeable future. As they continue to scale, we are delighted to support the company to add self-build infrastructure, and we are working towards a definitive agreement and are excited to support OpenAI's growth.
Yesterday, we celebrated an announcement with Anthropic. For the first time, Anthropic is adopting NVIDIA, and we are establishing a deep technology partnership to support Anthropic's fast growth. We will collaborate to optimize Anthropic models for CUDA and deliver the best possible performance, efficiency and TCO. We will also optimize future NVIDIA architectures for Anthropic workloads. Anthropic's compute commitment initially includes up to 1 gigawatt of compute capacity with Grace Blackwell and Vera Rubin systems.
Our strategic investments in Anthropic, Mistral, OpenAI, Reflection, Thinking Machines and others represent partnerships that grow the NVIDIA CUDA AI ecosystem and enable every model to run optimally on NVIDIA everywhere. We will continue to invest strategically while preserving our disciplined approach to cash flow management. Physical AI is already a multibillion-dollar business addressing a multitrillion-dollar opportunity and the next leg of growth for NVIDIA. Leading U.S. manufacturers and robotics innovators are leveraging NVIDIA's 3-computer architecture to train on NVIDIA, test on Omniverse computers and deploy real-world AI on Jetson robotic computers. PTC and Siemens introduced new services that bring Omniverse-powered digital twin workflows to their extensive installed base of customers. Companies including Belden, Caterpillar, Foxconn, Lucid Motors, Toyota, TSMC and Wistron are building Omniverse digital twin factories to accelerate AI-driven manufacturing and automation.
Agility Robotics, Amazon Robotics, Figure and Skild AI are building on our platform, tapping offerings such as NVIDIA Cosmos world foundation models for development, Omniverse for simulation and validation and Jetson to power next-generation intelligent robots.
We remain focused on building resiliency and redundancy in our global supply chain. Last month, in partnership with TSMC, we celebrated the first Blackwell wafer produced on U.S. soil. We will continue to work with Foxconn, Wistron, Amkor, SPIL and others to grow our presence in the U.S. over the next 4 years. Gaming revenue was $4.3 billion, up 30% year-on-year, driven by strong demand as Blackwell momentum continued. End market sell-through remains robust and channel inventories are at normal levels heading into the holiday season. Steam recently broke its concurrent user record with 42 million gamers, while thousands of fans packed the GeForce Gamer Festival in South Korea to celebrate 25 years of GeForce.
NVIDIA pro visualization has evolved into computers for engineers and developers, whether for graphics or for AI. Professional Visualization revenue was $760 million, up 56% year-over-year, another record. Growth was driven by DGX Spark, the world's smallest AI supercomputer, built on a small configuration of Grace Blackwell. Automotive revenue was $592 million, up 32% year-over-year, primarily driven by self-driving solutions. We are partnering with Uber to scale the world's largest Level 4-ready autonomous fleet, built on the new NVIDIA Hyperion L4 robotaxi reference architecture.
Moving to the rest of the P&L. GAAP gross margins were 73.4% and non-GAAP gross margins were 73.6%, exceeding our outlook. Gross margins increased sequentially due to our data center mix, improved cycle time and cost structure. GAAP operating expenses were up 8% sequentially and up 11% on a non-GAAP basis. The growth was driven by infrastructure compute as well as higher compensation and benefits and engineering development costs.
Non-GAAP effective tax rate for the third quarter was just over 17%, higher than our guidance of 16.5% due to the strong U.S. revenue. On our balance sheet, inventory grew 32% quarter-over-quarter, while supply commitments increased 63% sequentially. We are preparing for significant growth ahead and feel good about our ability to execute against our opportunity set.
Okay. Let me turn to the outlook for the fourth quarter. Total revenue is expected to be $65 billion, plus or minus 2%. At the midpoint, our outlook implies 14% sequential growth driven by continued momentum in the Blackwell architecture. Consistent with last quarter, we are not assuming any data center compute revenue from China. GAAP and non-GAAP gross margins are expected to be 74.8% and 75%, respectively, plus or minus 50 basis points.
Looking ahead to fiscal year 2027, input costs are on the rise, but we are working to hold gross margins in the mid-70s. GAAP and non-GAAP operating expenses are expected to be approximately $6.7 billion and $5 billion, respectively. GAAP and non-GAAP other income and expenses are expected to be an income of approximately $500 million, excluding gains and losses from nonmarketable and publicly held equity securities. GAAP and non-GAAP tax rates are expected to be 17%, plus or minus 1%, excluding any discrete items.
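The guidance arithmetic in the outlook can be checked with a short sketch. It uses only figures stated on the call ($57 billion Q3 revenue, a $65 billion Q4 midpoint with a 2% tolerance, and a 75% non-GAAP gross margin with a 50 basis point band); the function and variable names are hypothetical, and the inputs are the rounded numbers as spoken, so the results are approximate.

```python
def guidance_range(midpoint: float, tolerance: float) -> tuple[float, float]:
    """Low and high end of a 'midpoint plus or minus tolerance' guide."""
    return midpoint * (1 - tolerance), midpoint * (1 + tolerance)


def sequential_growth(current: float, previous: float) -> float:
    """Quarter-over-quarter growth as a fraction."""
    return current / previous - 1


# Rounded figures as stated on the call, in billions of USD.
Q3_REVENUE_BN = 57.0
Q4_MIDPOINT_BN = 65.0

low, high = guidance_range(Q4_MIDPOINT_BN, 0.02)
growth = sequential_growth(Q4_MIDPOINT_BN, Q3_REVENUE_BN)

# 50 basis points equals 0.5 percentage points around the 75% non-GAAP guide.
gm_low, gm_high = 75.0 - 0.5, 75.0 + 0.5

print(f"Q4 revenue range: ${low:.1f}B to ${high:.1f}B")
print(f"Implied sequential growth at midpoint: {growth:.1%}")
print(f"Non-GAAP gross margin range: {gm_low:.1f}% to {gm_high:.1f}%")
```

With these inputs the midpoint works out to roughly 14% sequential growth, consistent with the figure given in the outlook.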
At this time, let me turn the call over to Jensen for him to say a few words.
Thanks, Colette. There's been a lot of talk about an AI bubble. From our vantage point, we see something very different. As a reminder, NVIDIA is unlike any other accelerator. We excel at every phase of AI, from pre-training and post-training to inference. And with our 2-decade investment in CUDA-X acceleration libraries, we are also exceptional at science and engineering simulations, computer graphics, structured data processing and classical machine learning.
The world is undergoing 3 massive platform shifts at once, the first time since the dawn of Moore's Law. NVIDIA is uniquely addressing each of the 3 transformations. The first transition is from CPU general purpose computing to GPU accelerated computing as Moore's Law slows. The world has a massive investment in non-AI software, from data processing to science and engineering simulations, representing hundreds of billions of dollars in cloud computing spend each year.
Many of these applications, which once ran exclusively on CPUs, are now rapidly shifting to CUDA GPUs. Accelerated computing has reached a tipping point. Secondly, AI has also reached a tipping point and is transforming existing applications while enabling entirely new ones. For existing applications, generative AI is replacing classical machine learning in search ranking, recommender systems, ad targeting, click-through prediction and content moderation, the very foundations of hyperscale infrastructure.
Meta's GEM, a foundation model for ad recommendations trained on large-scale GPU clusters, exemplifies this shift. In Q2, Meta reported over a 5% increase in ad conversions on Instagram and a 3% gain on Facebook feed driven by generative AI-based GEM. Transitioning to generative AI represents substantial revenue gains for hyperscalers. Now a new wave is rising: agentic AI systems capable of reasoning, planning and using tools, from coding assistants like Cursor and Claude Code to radiology tools like Aidoc, legal assistants like Harvey and AI chauffeurs like Tesla FSD and --. These systems mark the next frontier of computing. The fastest-growing companies in the world today, OpenAI, Anthropic, xAI, Google, Cursor, Lovable, Replit, Cognition AI, OpenEvidence, Abridge and Tesla, are pioneering agentic AI.
So there are 3 massive platform shifts. The transition to accelerated computing is foundational and necessary, essential in a post-Moore's Law era. The transition to generative AI is transformational and necessary, supercharging existing applications and business models. And the transition to agentic and physical AI will be revolutionary, giving rise to new applications, companies, products and services.
As you consider infrastructure investments, consider these 3 fundamental dynamics. Each will contribute to infrastructure growth in the coming years. NVIDIA is chosen because our singular architecture enables all 3 transitions, and does so for any form and modality of AI, across all industries, across every phase of AI, across all of the diverse computing needs in the cloud, and also from cloud to enterprise to robots. One architecture.
Toshiya, back to you.
We will now open the call for questions. Operator, would you please poll for questions?
[Operator Instructions]. Your first question comes from Joseph Moore with Morgan Stanley.
I wonder if you could update us. You talked about the $500 billion of revenue for Blackwell plus Rubin in '25 and '26 at GTC. At that time, you talked about $150 billion of that already having been shipped. So as the quarter has wrapped up, are those still kind of the general parameters, that there's $350 billion in the next kind of 14 months or so? And I would assume over that time, you haven't seen all the demand, so is there any possibility of upside to those numbers as we move forward?
Yes. Thanks, Joe. I'll start first with a response here on that. Yes, that's correct. We are working into our $500 billion forecast. And we are on track for that as we have finished some of the quarters, and now we have several quarters now in front of us to take us through the end of calendar year '26. The number will grow. And we will achieve, I'm sure, additional needs for compute that will be shippable by fiscal year '26. So we shipped $50 billion this quarter, but we would be not finished if we didn't say that we'll probably be taking more orders.
For example, just even today, our announcements with KSA. And that agreement in itself is 400,000 to 600,000 more GPUs over 3 years. Anthropic is also net new. So there's definitely an opportunity for us to have more on top of the $500 billion that we announced.
The next question comes from CJ Muse with Cantor Fitzgerald.
There's clearly a great deal of consternation around the magnitude of AI infrastructure build-outs and the ability to fund such plans and the ROI. Yet, at the same time, you're talking about being sold out, every stood-up GPU is taken. The AI world hasn't seen the enormous benefit yet from B300, never mind Rubin, and Gemini 3 was just announced, with Grok 5 coming soon. And so the question is this: when you look at that as the backdrop, do you see a realistic path for supply to catch up with demand over the next 12 to 18 months? Or do you think it can extend beyond that time frame?
Well, as you know, we've done a really good job planning our supply chain. NVIDIA's supply chain basically includes every technology company in the world. And TSMC and their packaging and our memory vendors and memory partners and all of our system ODMs have done a really good job planning with us. And we were planning for a big year. We've seen for some time the 3 transitions that I spoke about just a second ago, accelerated computing from general-purpose computing. And it's really important to recognize that AI is not just agentic AI, but generative AI is transforming the way that hyperscalers did the work that they used to do on CPUs.
Generative AI made it possible for them to move search and recommender systems and ad recommendations and targeting. All of that has been moved to generative AI and is still transitioning. And so whether you install NVIDIA GPUs for data processing, or you did it for generative AI for your recommender system, or you're building it for agentic chatbots and the type of AI that most people see when they think about AI, all of those applications are accelerated by NVIDIA. And so when you look at the totality of the spend, it's really important to think about each one of those layers. They're all growing. They're related, but not the same, but the wonderful thing is that they all run on NVIDIA GPUs.
Simultaneously, because the quality of the AI models is improving so incredibly, the adoption of it in the different use cases is growing, whether it's in code assistance, which NVIDIA uses fairly exhaustively, and we're not the only one. I mean, the fastest-growing application in history is a combination of Cursor and Claude Code and OpenAI Codex and GitHub Copilot. These applications are the fastest-growing in history. And it's not just used by software engineers. Because of vibe coding, it's used by engineers and marketers all over companies, supply chain planners all over companies.
And so I think that that's just one example, and the list goes on, whether it's OpenEvidence and the work that they do in health care or the work that's being done in digital video editing by Runway. I mean, a number of really, really exciting start-ups that are taking advantage of generative AI and agentic AI are growing quite rapidly. And not to mention we're all using it a lot more.
And so all of these exponentials, not to mention, just today, I was reading a text from Demis. And he was saying that pre-training and post-training are fully intact. And Gemini 3 takes advantage of the scaling laws and got a huge jump in quality performance, model performance. And so we're seeing all of these exponentials kind of running at the same time. And just always go back to first principles and think about what's happening from each one of the dynamics that I mentioned before: general purpose computing to accelerated computing, generative AI replacing classical machine learning and, of course, agentic AI, which is a brand-new category.
The next question comes from Vivek Arya with Bank of America Securities.
I'm curious, what assumptions are you making on NVIDIA content per gigawatt in that $500 billion number? Because we have heard numbers as low as $25 billion per gigawatt of content to as high as $30 billion or $40 billion per gigawatt. So I'm curious what power and what dollar per gigawatt assumptions you are making as part of that $500 billion number. And then longer term, Jensen, the $3 trillion to $4 trillion in data center by 2030 was mentioned. How much of that do you think will require vendor financing? And how much of that can be supported by cash flows of your large customers or governments or enterprises?
In each generation, from Ampere to Hopper, from Hopper to Blackwell, Blackwell to Rubin, our part of the data center increases. And the Hopper generation was probably something along the lines of 20-some-odd, 20 to 25. The Blackwell generation, Grace Blackwell particularly, is probably 30, say, 30 plus or minus, and then Rubin is probably higher than that.
And in each one of these generations, the speed-up is X factors. And therefore, their TCO, the customer TCO, improves by X factors, and the most important thing is, in the end, you still only have 1 gigawatt of power. One gigawatt data centers, 1 gigawatt power. And therefore, performance per watt, the efficiency of your architecture, is incredibly important. And the efficiency of your architecture can't be brute forced. There is no brute forcing about it. That 1 gigawatt translates directly. Your performance per watt translates directly, absolutely directly, to your revenues, which is the reason why choosing the right architecture matters so much now. The world doesn't have an excess of anything to squander. And so we have to be really, really -- we use this concept called co-design across our entire stack, across the frameworks and models, across the entire data center, even power and cooling, optimized across the entire supply chain or ecosystem.
And so each generation, our economic contribution will be greater. Our value delivered will be greater, but the most important thing is our energy efficiency per watt is going to be extraordinary, every single generation. With respect to continuing to grow, our customers' financing is up to them. We see the opportunity to grow for quite some time. And remember, today, most of the focus has been on the hyperscalers.
And one of the areas that is really misunderstood about the hyperscalers is that the investment in NVIDIA GPUs not only improves their scale, speed and cost from general purpose computing. That's number one, because Moore's Law scaling has really slowed. Moore's Law is about driving cost down. It's about deflationary cost, the incredible deflationary cost of computing over time. But that has slowed. Therefore, a new approach is necessary for them to keep driving the cost down.
Going to NVIDIA GPU computing is really the best way to do so. The second is revenue boosting in their current business models. Recommender systems drive the world's hyperscalers. Every single -- whether it's watching short-form videos or recommending books or recommending the next item in your basket to recommending ads to recommending news -- it's all about recommenders. The Internet has trillions of pieces of content. How could they possibly figure out what to put in front of you and your little tiny screen unless they have really sophisticated recommender systems to do so?
Well, that has gone generative AI. So for the first 2 things that I've just said, the hundreds of billions of dollars of CapEx that's going to have to be invested is fully cash flow funded. What is above it, therefore, is agentic AI. This is net new, net new consumption, but it's also net new applications, some of the applications I mentioned before, and these new applications are also the fastest-growing applications in history, okay? So I think that you're going to see that once people start to appreciate what is actually happening under the water, if you will, from the simplistic view of what's happening to CapEx investment, recognizing there are these 3 dynamics.
And then lastly, remember, we were just talking about the American CSPs. Each country will fund their own infrastructure. And you have multiple countries, you have multiple industries. Most of the world's industries haven't really engaged agentic AI yet, and they're about to. All the names of companies that you know we're working with, whether it's autonomous vehicle companies or digital twins for physical AI for factories and the number of factories and warehouses being built around the world, just the number of digital biology start-ups that are being funded so that we could accelerate drug discovery. All of those different industries are now getting engaged, and they're going to do their own fundraising. And so don't just look at the hyperscalers as a way to build out for the future. You've got to look at the world, you've got to look at all the different industries, and enterprise computing is going to fund their own industry.
The next question comes from Ben Reitzes with Melius.
Jensen, I wanted to ask you about cash. Speaking of $0.5 trillion, you may generate about $0.5 trillion in free cash flow over the next couple of years. What are your plans for that cash? How much goes to buyback versus investing in the ecosystem? And how do you look at investing in the ecosystem? I think there's just a lot of confusion out there about how these deals work and your criteria for doing those, like the Anthropics, the OpenAIs, et cetera.
Yes, I appreciate the question. Of course, using cash to fund our growth, no company has grown at the scale that we're talking about and have the connection and the depth and the breadth of supply chain that NVIDIA has. The reason why our entire customer base can rely on us is because we've secured a really resilient supply chain, and we have the balance sheet to support them.
When we make purchases, our suppliers can take it to the bank. When we make forecasts and we plan with them, they take us seriously because of our balance sheet. We're not making up the offtake. We know what our offtake is, and because they've been planning with us for so many years, our reputation and our credibility is incredible. And so it takes a really strong balance sheet to do that, to support the level of growth and the rate of growth and the magnitude associated with that. So that's number one.
The second thing, of course, we're going to continue to do stock buybacks. We're going to continue to do that. But with respect to the investments, this is really, really important work that we do. All of the investments that we've done so far, all of them, period, are associated with expanding the reach of CUDA, expanding the ecosystem. If you look at the investments that we did with OpenAI, of course, that relationship we've had since 2016. I delivered the first AI supercomputer ever made to OpenAI. And so we've had a close and wonderful relationship with OpenAI since then. And everything that OpenAI does runs on NVIDIA today. So all the clouds that they deploy in, whether it's training or inference, run NVIDIA, and we love working with them.
The partnership that we have with them is one so that we could work even deeper from a technical perspective, so that we could support their accelerated growth. This is a company that's growing incredibly fast. And don't just look at what is said in the press. Look at all the ecosystem partners and all the developers that are connected to OpenAI, and they're all driving consumption of it. And the quality of the AI that's being produced is a huge step-up since a year ago. And so the quality of response is extraordinary.
So we invest in OpenAI for a deep partnership in co-development to expand our ecosystem and support their growth. And of course, rather than giving up a share of our company, we get a share of their company. And we invested in them, one of the most consequential once-in-a-generation companies, a once-in-a-generation company that we have a share in. And so I fully expect that investment to translate to extraordinary returns.
Now in the case of Anthropic, this is the first time that Anthropic will be on NVIDIA's architecture. The first time Anthropic will be on NVIDIA's architecture. Anthropic is the second most successful AI in the world in terms of total number of users. But in enterprise, they're doing incredibly well. Claude Code is doing incredibly well. Claude is doing incredibly well across all of the world's enterprises. And now we have the opportunity to have a deep partnership with them and to bring Claude onto the NVIDIA platform.
And so what do we have now? NVIDIA's architecture, taking a step back, NVIDIA's platform is the singular platform in the world that runs every AI model. We run OpenAI, we run Anthropic, we run xAI. Because of our deep partnership with Elon and xAI, we were able to bring that opportunity to Saudi Arabia, to the KSA, so that HUMAIN could also be a hosting opportunity for xAI. We run xAI. We run Gemini. We run Thinking Machines. Let's see, what else do we run? We run them all. And so, not to mention, we run the science models, the biology models, DNA models, gene models, chemical models and all the different fields around the world. It's not just cognitive AI that the world uses. AI is impacting every single industry.
And so, through the ecosystem investments that we make, we have the ability to partner deeply on a technical basis with some of the best companies, most brilliant companies in the world. We are expanding the reach of our ecosystem, and we're getting a share and investment in what will be a very successful company, oftentimes a once-in-a-generation company. And so that's our investment thesis.
The next question comes from Jim Schneider with Goldman Sachs.
In the past, you've talked about roughly 40% of your shipments tied to AI inference. I'm wondering, as you look forward into next year, where do you expect that percentage could go in a year's time? And can you maybe address the Rubin CPX product you expect to introduce next year or contextualize that, how big of the overall TAM you expect that can take? And maybe talk about some of the target customer applications for that specific product.
It is designed for long-context types of workload generation. And so long context, basically, before you start generating answers, you have to read a lot. It could be a bunch of PDFs. It could be watching a bunch of videos, studying 3D images, so on and so forth. You have to absorb the context. And so CPX is designed for long-context types of workloads. And its perf per dollar is excellent. Its perf per watt is excellent. Which made me forget the first part of the question...
Inferencing...
Inference, yes. There are 3 scaling laws that are scaling at the same time. The first scaling law, called pre-training, continues to be very effective. And the second is post-training. Post-training basically has found incredible algorithms for improving an AI's ability to break a problem down and solve a problem step by step. And post-training is scaling exponentially. Basically, the more compute you apply to a model, the smarter it is, the more intelligent it is. And then the third is inference. Because of chain of thought, because of reasoning capabilities, AIs are essentially reading and thinking before they answer. And the amount of computation necessary as a result of those 3 things has gone completely exponential. I think that it's hard to know exactly what the percentage will be at any given point in time and for whom.
But of course, our hope is that inference is a very large part of the market, because if inference is large, then what it suggests is that people are using it in more applications and they're using it more frequently. We should all hope for inference to be very large. And this is where Grace Blackwell is just an order of magnitude better, more advanced than anything in the world.
The second best platform is H200, and it's very clear now with GB200 and GB300, because of NVLink 72, the scale-up network that we have achieved. And you saw, as Colette talked about, the SemiAnalysis benchmark. It's the largest single inference benchmark ever done, and GB200 NVLink 72 is 10x, 10 to 15x higher performance. And so that's a big step up. It's going to take a long time before somebody is able to take that on. And our leadership there is surely multiyear. And so I'm hoping that inference becomes a very big deal. Our leadership in inference is extraordinary.
The next question comes from Timothy Arcuri with UBS.
Jensen, many of your customers are pursuing behind the meter power, but like what's the single biggest bottleneck that worries you that could constrain your growth? Is it power? Or maybe it's financing or maybe it's something else like memory or even foundry?
Well, these are all issues and they're all constraints. And the reason for that, when you're growing at the rate that we are and the scale that we are, how could anything be easy? What NVIDIA is doing obviously has never been done before. And we've created a whole new industry.
Now on the one hand, we are transitioning computing from general purpose and classical or traditional computing to accelerated computing and AI. That's on the one hand. On the other hand, we created a whole new industry called AI factories. The idea is that in order for software to run, you need these factories to generate it, to generate every single token instead of retrieving information that was pre-created. And so I think this whole transition requires extraordinary scale, all the way from the supply chain. Of course, the supply chain we have much better visibility and control of because, obviously, we're incredibly good at managing our supply chain. We have great partners that we've worked with for 33 years.
And so on the supply chain part of it, we're quite confident. Now looking down our supply chain, we've now established partnerships with so many players in land and power and shell. And of course, financing. None of these things are easy, but they're all tractable and they're all solvable things. And the most important thing that we have to do is do a good job planning. We plan up the supply chain, down the supply chain. We have established a whole lot of partners. And so we have a lot of routes to market. And very importantly, our architecture has to deliver the best value to the customers that we have.
And so at this point, I'm very confident that NVIDIA's architecture is the best performance for -- it is the best performance per watt. And therefore, for any amount of energy that is delivered, our architecture will drive the most revenues. And I think the increasing rate of our success, I think that we're more successful this year at this point than we were last year at this point. The number of customers coming to us and the number of platforms coming to us after they've explored others is increasing, not decreasing. And so I think all of that, all the things that I've been telling you over the years, are really coming through or becoming evident.
The next question comes from Stacy Rasgon with Bernstein Research.
Colette, I have some questions on margins. You said for next year, you're working to hold them in the mid-70s. So I guess, first of all, what are the biggest cost increases? Is it just memory or is it something else? What are you doing to work toward that? How much is cost optimizations versus prebuys versus pricing? And then also, how should we think about OpEx growth next year, given that revenues seem likely to grow materially from where we're running right now?
Thanks, Stacy. Let me see if I can start with remembering where we were with the current fiscal year that we're in. Remember, earlier this year, we indicated that through cost improvements and mix, we would exit the year with our gross margins in the mid-70s. We've achieved that and are getting ready to also execute that in Q4. So now it's time for us to communicate where we are working right now in terms of next year.
Next year, there are input prices that are well known in the industry that we need to work through. And our systems are by no means very easy to work with. There is a tremendous amount of components in many different parts of it as we think about that. So we're taking all of that into account, but we do believe, as we look at working again on cost improvement, cycle time and mix, that we will work to try and hold our gross margins in the mid-70s. So that's our overall plan for gross margin.
Your second question is around OpEx. And right now, our goal in terms of OpEx is to really make sure that we are innovating with our engineering teams, with all of our business teams, to create more and more systems for this market. As you know, right now, we have a new architecture coming out. And that means they are quite busy in order to meet that goal. And so we're going to continue to see our investments in innovating more and more across our software, our systems and our hardware to do so. I'll turn it to Jensen if he wants to add a couple more comments.
Yes, that's spot on. I think the only thing that I would add is, remember that we plan, we forecast and we negotiate with our supply chain well in advance. Our supply chain has known our requirements for quite a long time. And they've known our demand for quite a long time, and we've been working with them and negotiating with them for quite a long time. And so I think the recent surge is obviously quite significant.
But remember, our supply chain has been working with us for a very long time. So in many cases, we've secured a lot of supply for ourselves because, obviously, they're working with the largest company in the world in doing so. And we've also been working closely with them on the financial aspects of it and securing forecasts and plans and so on and so forth. So I think all of that has worked out well for us.
Your final question comes from the line of Aaron Rakers with Wells Fargo.
Jensen, a question for you. As you think about the Anthropic deal that was announced and just the overall breadth of your customers, I'm curious if your thoughts around the role that AI ASICs or dedicated XPUs play in these architecture build-outs have changed at all. I think you've been fairly adamant in the past that some of these programs never really see deployments. But I'm curious if we're at a point where maybe that's even changed more in favor of just GPU architecture.
Yes. Thank you very much, and I really appreciate the question. So first of all, you're not competing against -- excuse me, as a company, you're competing against teams. And there just aren't that many teams in the world who are extraordinary at building these incredibly complicated things.
Back in the Hopper days and the Ampere days, we would build one GPU. That's the definition of an accelerated AI system. But today, we've got to build entire racks and -- 3 different types of switches: scale-up, scale-out and scale-across switches. And it takes a lot more than 1 chip to build a compute node anymore. Everything about that computing system has changed, because AI needs to have memory. AI didn't use to have memory at all. Now it has to remember things, and the amount of memory and context it has is gigantic. The memory architecture implication is incredible. The diversity of models, from mixture of experts to dense models, to diffusion models, to autoregressive models, not to mention biological models that are based on the laws of physics, the list of different types of models has exploded in the last several years.
And so the challenge is the complexity of the problem is much higher. The diversity of AI models is incredibly, incredibly large. And so this is where, I will say, there are 5 things that make us special, if you will. The first thing I would say that makes us special is that we accelerate every phase of that transition. That's the first phase. CUDA allows us to have CUDA-X for transitioning from general purpose to accelerated computing. We are incredibly good at generative AI. We're incredibly good at agentic AI. So through every single phase, every single layer of the transition, we are excellent. You can invest in 1 architecture and use it across the board. You can use 1 architecture and not worry about the changes in the workload across those 3 phases. That's number one.
Number two, we're excellent at every phase of AI. Everybody's always known that we're incredibly good at pre-training. We're obviously very good at post-training, and we're incredibly good, as it turns out, at inference, because inference is really, really hard. How could thinking be easy? People think that inference is one shot and therefore it's easy, anybody could approach the market that way. But it turns out to be the hardest of all, because thinking, as it turns out, is quite hard. We're great at every phase of AI, the second thing.
The third thing is we're now the only architecture in the world that runs every AI model. Every frontier AI model. We run open source AI models incredibly well. We run science models, biology models, robotics models. We run every single model. We're the only architecture in the world that can claim that. It doesn't matter whether you're autoregressive or diffusion based. We run everything and we run it for every major platform, as I just mentioned. So we run every model.
And then the fourth thing I would say is that we're in every cloud. The reason why developers love us is because we're literally everywhere. We're in every cloud, we're in every -- we can even make you a little tiny cloud called DGX Spark. And so we're in every computer. We're everywhere, from cloud to on-prem to robotic systems, edge devices, PCs, you name it. One architecture, things just work. It's incredible.
And then the last thing, and this is probably the most important thing, the fifth thing is, if you are a cloud service provider, if you're a new company like HUMAIN, if you're a new company like CoreWeave, Nscale or -- OCI for that matter, the reason why NVIDIA is the best platform for you is because our offtake is so diverse. We can help you with offtake. It's not about just putting a random ASIC into a data center. Where is the offtake coming from? Where is the diversity coming from? Where is the resilience coming from? Where is the versatility of the architecture coming from, the diversity of capability coming from? NVIDIA has such incredibly good offtake because our ecosystem is so large. So these 5 things: every phase of acceleration and transition, every phase of AI, every model, every cloud to on-prem. And of course, finally, it all leads to offtake.
Thank you. I will now turn the call to Toshiya Hari for closing remarks.
In closing, please note, we will be at the UBS Global Technology and AI Conference on December 2, and our earnings call to discuss the results of our fourth quarter of fiscal 2026 is scheduled for February 25. Thank you for joining us today. Operator, please go ahead and close the call.
Thank you. This concludes today's conference call. You may now disconnect.
NVIDIA — Q3 2026 Earnings Call
📊 Quarter at a glance
- Revenue: $57.0 billion (+62% YoY; +22% QoQ, +$10 billion)
- Data Center: $51.0 billion (+66% YoY), the main driver, on the GB300 ramp
- Networking: $8.2 billion (+162% YoY)
- Gross margin: GAAP 73.4% / non-GAAP 73.6%
- Inventory & commitments: inventory +32% QoQ; supply commitments +63% QoQ
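As a rough cross-check of the headline figures, the growth rates can be inverted to back out the comparison periods. The sketch below uses only the numbers listed in the summary; the function and variable names are illustrative, and the results are implied values from rounded inputs, not reported figures.

```python
# Figures as stated in the summary (USD billions; growth rates as fractions).
REVENUE = 57.0
REVENUE_YOY = 0.62
REVENUE_QOQ = 0.22
DATA_CENTER = 51.0


def implied_base(current: float, growth: float) -> float:
    """Back out the comparison-period value from a current value and its growth rate."""
    return current / (1.0 + growth)


def summarize() -> dict:
    """Derive implied comparison-period revenue and the Data Center revenue share."""
    prior_year = implied_base(REVENUE, REVENUE_YOY)
    prior_quarter = implied_base(REVENUE, REVENUE_QOQ)
    return {
        "implied_prior_year_revenue": round(prior_year, 1),
        "implied_prior_quarter_revenue": round(prior_quarter, 1),
        "sequential_increase": round(REVENUE - prior_quarter, 1),
        "data_center_share_pct": round(100 * DATA_CENTER / REVENUE, 1),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

The implied sequential increase of roughly $10.3 billion is consistent with the "+$10 billion" stated in the revenue line, and Data Center works out to just under 90% of total revenue.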
🎯 What management says
- Three platforms: NVIDIA positions itself as the only full-stack platform for accelerated computing, generative AI, and agentic/physical AI.
- Product roadmap: GB300 dominates Blackwell revenue; Rubin (Vera Rubin) is expected to ramp in H2 2026; focus on performance per watt and TCO advantage.
- Ecosystem investments: strategic capital deployments (e.g. Anthropic, OpenAI) to scale CUDA adoption and market offtake.
🔭 Outlook & guidance
- Q4 outlook: $65 billion revenue ±2% (midpoint ≈ +14% sequentially), with no data center compute revenue from China assumed.
- Margins & opex: gross margin GAAP 74.8% / non-GAAP 75% ±50 bps; FY27 gross margin target: mid-70% range; GAAP opex ≈ $6.7 billion, non-GAAP ≈ $5.0 billion.
- Risks: input prices, geopolitical restrictions (China shipments), and infrastructure constraints (power, financing) were cited.
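The guidance bands can be expanded into explicit ranges. This is a minimal sketch assuming the ±2% applies symmetrically to the $65 billion revenue midpoint and that ±50 bps means ±0.5 percentage points around the 75% non-GAAP gross margin; the helper names are illustrative.

```python
from typing import Tuple

Q3_REVENUE = 57.0          # USD billions, from the quarter summary
Q4_MIDPOINT = 65.0         # USD billions, guided midpoint
Q4_TOLERANCE = 0.02        # plus or minus 2%
NON_GAAP_MARGIN_PCT = 75.0
MARGIN_TOLERANCE_BPS = 50  # 1 basis point = 0.01 percentage points


def revenue_range(midpoint: float, tolerance: float) -> Tuple[float, float]:
    """Return the (low, high) revenue range for a symmetric relative tolerance."""
    return midpoint * (1 - tolerance), midpoint * (1 + tolerance)


def margin_band(mid_pct: float, bps: float) -> Tuple[float, float]:
    """Return the (low, high) margin band in percent for a tolerance in basis points."""
    delta = bps / 100.0
    return mid_pct - delta, mid_pct + delta


def sequential_growth(current: float, guided: float) -> float:
    """Growth of the guided value over the current quarter, as a fraction."""
    return guided / current - 1.0


if __name__ == "__main__":
    low, high = revenue_range(Q4_MIDPOINT, Q4_TOLERANCE)
    m_low, m_high = margin_band(NON_GAAP_MARGIN_PCT, MARGIN_TOLERANCE_BPS)
    growth = sequential_growth(Q3_REVENUE, Q4_MIDPOINT)
    print(f"Q4 revenue range: {low:.1f} to {high:.1f} (USD billions)")
    print(f"Non-GAAP gross margin band: {m_low:.1f}% to {m_high:.1f}%")
    print(f"Sequential growth at midpoint: {growth:.1%}")
```

Under these assumptions the revenue range is about $63.7 billion to $66.3 billion, and the midpoint implies sequential growth of about 14%, matching the figure in the outlook line.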
❓ Analyst questions
- Supply vs. demand: analysts asked when supply will catch up with demand; management points to extensive advance planning with TSMC and partners and considers it feasible, but names bottlenecks (power, financing, components).
- $0.5 trillion assumptions: Jensen cites Blackwell at ≈ $30+/GW (Rubin higher); the key drivers are perf per watt and the performance jump with each generation.
- Capital allocation: asked about the use of cash (buybacks vs. investments), management answered: both; strategic investments to broaden the ecosystem remain a priority.
⚡ Bottom line
- Takeaway: strong operating momentum, with massive Data Center growth, high margins, and a clear roadmap (GB300→Rubin). Core themes for shareholders: demand remains very high and ecosystem investments are targeted, but geopolitical and infrastructure risks (China, energy, input costs) remain relevant sources of uncertainty.
NVIDIA — Special Call - NVIDIA Corporation
1. Management Discussion
Good morning, everyone. I'm Mylene Mangalindan, Vice President of Corporate Communications at NVIDIA. Thank you for joining us to discuss the press release we issued today regarding a collaboration between NVIDIA and Intel to jointly develop AI infrastructure and personal computing products.
With me on the call today are Jensen Huang, Founder and Chief Executive Officer of NVIDIA; and Lip-Bu Tan, Chief Executive Officer of Intel Corporation. [Operator Instructions] As a reminder, this call is being recorded.
The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without prior written consent. During this call, NVIDIA and Intel may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties. For a discussion of factors that could affect their businesses, please refer to the disclosure in NVIDIA's and Intel's most recent Forms 10-K and 10-Q and the reports that they may file on Form 8-K with the Securities and Exchange Commission.
With that, let me turn the call over to Jensen.
Good morning, everyone. AI is driving a reinvention of every layer of the computing stack. 60 years ago, IBM introduced the System/360, the first general-purpose computer. It launched the modern computing era powered by Moore's Law and CPUs, programmed line by line by human hands, but general purpose computing has reached its limits. To keep advancing, we invented a new way forward. NVIDIA pioneered GPU-accelerated computing, increasing performance by orders of magnitude, tens, hundreds, thousands of times faster while dramatically improving energy and cost efficiency. That innovation opened new frontiers in science and industry, and it sparked the Big Bang of artificial intelligence. Today, we're taking the next great step.
Just moments ago, NVIDIA and Intel announced a historic partnership to jointly develop multiple generations of x86 CPUs for data centers and PC products. This collaboration will tightly couple and optimize Intel's x86 CPUs for NVIDIA's AI and accelerated computing architecture. Together, our companies will build custom Intel x86 CPUs for NVIDIA's AI infrastructure platforms, our data center platforms, bringing x86 into NVIDIA's NVLink ecosystem. And for personal computing, we're going to create new Intel x86 SoCs that integrate NVIDIA GPU chiplets, fusing the world's best CPU and GPU to redefine the PC experience.
This partnership is a recognition that computing has fundamentally changed. The era of accelerated and AI computing has arrived. Today is a very exciting day and a very big day. Intel and NVIDIA are partnering to drive it forward. I'm delighted to partner with Lip-Bu Tan, long-time friend and many of my colleagues at Intel in this great partnership, this historic partnership. And now let me turn it over to Lip-Bu to tell you about the exciting partnership we're entering into.
Yes. Thank you, Jensen, and thank you, all of you for joining. Jensen and I know each other for more than 10 -- 30 years. And I still remember that Jensen had the vision of building the system platform, software and with the CUDA and the long-term vision that you have for the company. And I have to salute him. He's done a fabulous job building that AI platform, driving the whole new market opportunity.
I'm so excited to be able to work together with Jensen to build a new era. And this is historical collaboration between the 2 companies. And I think this is a very big important milestone. I call it a game-changing opportunity that we can work together. And I would say that we are proud that NVIDIA is an investor to Intel, and thank you for supporting and confident in us and trust in us. And I think this milestone, the critical role that Intel can play in the ecosystem, and I'm grateful for the confidence that you place with us.
In terms of product perspective, and I just make 3 points just to add on to building on the Jensen comments. Number one, this collaboration is built on the core strengths of both companies. NVIDIA is a clear leader in AI accelerated computing. Intel is a leader in the data center and client PC CPU. This collaboration brings all together for the best for the industry going forward. Number two, this collaboration has unleashed the new era of x86 innovation and x86 is a foundational role to play in the next era of computing. And I'm excited about what can be created together in the NVLink rack scale solution that's based on x86. And x86 client SoC with NVLink interconnect and unified memory and also the advanced packaging that can bring all together and then to make it together for making the company and for the industry. And number three, all of this is great for our customer.
Our motto is very simple, make great products and delight the customer. That is what this collaboration is all about, and our team are ready and excited to work together, both teams to make it a success together.
And now I'll open up for some questions.
This concludes our prepared remarks. Operator, please open the call for questions. Thank you.
[Operator Instructions] Your first question comes from Jim Cramer of CNBC.
First, congratulations, gentlemen. Can you help us understand what the market landscape looks like with this product partnership? And how does it expand the opportunity and growth for both companies?
Yes. Thanks a lot, Jim. Lip-Bu, I'll take the first shot at it, and you could help me out. So if you look at the AI computing world, let me take it in 2 segments, the AI computing world first. These are supercomputers. And as you know, Jim, we recently introduced the scale-up NVLink 72 rack scale computers. That involves designing a custom CPU we call Vera that's tightly integrated with the GPUs, Blackwell GPUs so that we can disaggregate NVLink switches, scale it up into a rack scale system, essentially have an entire rack behaving as if it's one giant computer, one giant GPU.
Well, in order to do that, we have to really customize the CPU to do that. And so this architecture, the NVLink 72 rack scale architecture is only available for the Vera CPU that we build, the ARM CPU that we build. And for the x86 ecosystem, it's really unavailable except with server CPUs over PCI Express. And that has limitations in how far you could scale these scale-up systems. And so the first opportunity is that we can now with Intel x86 CPU, integrate it directly into NVLink ecosystem and create these rack-scale AI supercomputers.
The second thing is there's 150 million laptops sold per year. And NVIDIA's market largely targets squarely at gaming and workstation markets where the discrete GPUs are used. And we're very successful there, and we continue to grow there, and we're going to continue to grow there. There's an entire segment of the market where the CPU and the GPU are integrated, and it's integrated for form factor reasons, maybe it's for cost reasons, maybe it's for battery life reasons, all kinds of different reasons. And that segment has been largely unaddressed by NVIDIA today.
And so what the Intel team and I are doing, NVIDIA is doing is that we're creating an SoC that fuses 2 processors. It fuses the CPU and NVIDIA's GPU, RTX GPU using NVLink. And it fuses these 2 dies into essentially virtual giant SoC, and that would become essentially a new class of integrated graphics laptops that the world has never seen before. That entire -- that segment of the market is really quite rich, and it's really quite large, and it's underserved today with state-of-the-art world-class GPUs like NVIDIA is able to build.
I think Jensen, if I can just add on to it. And I think clearly, it's all about the scale and in terms of the best of GPU accelerator and then the best of x86 and then with NVLink linked together and then able to scale. And some of the markets that we both are doing well, but we can expand even more in terms of some of the application solution, the vertical market we can go after.
If I may just follow up for a second. Jensen, will you be using Intel's foundries to make high-end chips like the Grace Platform or more importantly, Vera Rubin, the kind of best-in-class semiconductors that right now you have made with Taiwan Semi?
Well, we've always evaluated Intel's foundry technology, and we're going to continue to do that. But today, this announcement is squarely focused on these custom CPUs. With this partnership, with this agreement, we're essentially going to be a major customer of Intel server CPUs. This is the very first time. At the moment, we buy CPUs, ARM-based CPUs from TSMC. And the x86-based PCI Express CPUs are sold openly in the marketplace. In the future, we will buy x86 CPUs from Intel and we would fuse it with NVLink into our rack scale system. So we're going to become a very large customer of Intel CPUs.
The second thing is that we're going to be quite a large supplier of GPU chiplets into x86, Intel x86 CPU SoCs. And so in that particular case, we're going to be a supplier into a market segment we've never addressed. In the server, we're going to be a customer, a major customer of Intel's, and we'll integrate it into NVLink 72 and resell CPUs essentially. And so this is going to be a great growth business opportunity for both of us.
Your next question comes from Ian King of Bloomberg.
Thank you very much for doing this. Interesting times we're living in. I wondered if you could talk a little bit about how long you've been working on this agreement. And then obviously, you talked a lot about custom solutions and custom parts. Those don't happen overnight. When might we get to the point where we're seeing devices in the end markets for sale, please?
The 2 technology teams have been discussing and architecting solutions now for probably coming up to a year. And the 2 architecture teams -- well, it's 3 architecture teams are working across, of course, the CPU architecture as well as product lines for server and PCs. And so the architecture work is fairly extensive. And the teams are really excited about the new architecture. And so the teams have been working a while, and we're excited about the announcement today.
Your next question comes from the line of Michael Acton with Financial Times.
Gentlemen, why can't you commit to Intel as a foundry for your most advanced AI chips at this point? And does this pave the way for sort of deeper manufacturing collaboration or not? Are you confident Intel is going to get there? And then secondly, I'd be interested what sort of involvement did the Trump administration have in this agreement, if any?
The Trump administration had no involvement in this partnership at all, and they would have been very supportive, of course. And today, I had the opportunity to tell Secretary Lutnick, and he was very excited and very supportive of seeing American technology companies working together. I think Lip-Bu and I would both say that TSMC is a world-class foundry. And in fact, we're both very successful customers of TSMC's. And the capabilities of TSMC from process technology, their rhythm of execution, the scale of their capacity and infrastructure, the agility of their business operations and just all of the magic that comes together for being a world-class foundry supporting customers of such diverse needs is really quite extraordinary. And I can't -- you just can't overstate the magic that is TSMC. But today, our conversation today, our partnership today is completely focused.
It is 100% focused on the custom CPUs that we're building for the data center that now has NVLink capabilities and can connect to the NVLink and the NVIDIA AI supercomputing ecosystem. And it's completely focused on mobile SoCs for consumer PCs that now fuse the world's best CPU and the world's best GPU for consumer products. That segment, the first segment, of course, the data center CPU segment is probably something along the lines of $30 billion a year or so. In the case of -- and this is going to -- this combination of Intel and our technology is going to address a fairly significant swath of that because it is the fastest-growing segment.
And we all can agree that the future computing is going to be about AI through and through. And so this is an exciting partnership for the data center market. And then for the consumer market, it's 150 million laptops sold each year, and we're now going to combine the best CPU and the best GPU. And so it's really, really exciting. Lip-Bu?
Yes. I think clearly, this is historical. And this is also my 6 months as Intel CEO. And from day 1, Jensen and I will work on it, and then we accelerate that process and then two teams work together to this game-changing opportunity. It's a deep partnership, and we're looking forward for multiple ways we work together.
Next question comes from Laura Bratton with Yahoo Finance.
I'm really curious about which manufacturing process these new CPUs will be made on. I know that Lip-Bu, you said that 14A is only going to go forward if it has meaningful volume or customer commitment. I just wonder, can you comment about, yes, which manufacturing process the CPUs would be made on? And I know you said this is strictly a product announcement, but can you all comment at all on whether this might pave the way for NVIDIA to collaborate with Intel Foundry services in the future? And that's all.
Yes. I think this announcement is more on the product collaborations. And clearly, like Jensen mentioned, TSMC has been a great partner, long-time partner for NVIDIA and also for Intel. So we're going to continue doing that. And I think this kind of -- in terms of process, I think later on, we can describe more. But I think right now, we are focused on collaborations. And then a certain date, then we can have more announcement down the road when the product is ready.
Yes. I think it's safe to say that the partnership that we're entering into is going to address some $25 billion, $50 billion of annual opportunity. And so this is a very significant partnership, and we're completely focused on that. With -- one of the things that I will say is that our ARM road map is going to continue. And we're committed -- we're fully committed to the ARM road map. We have lots and lots of customers for ARM. We're building the next generation of Vera -- the next generation of Grace called Vera, and we have the next generation after that. We have exciting CPUs that we're building based on ARM.
We're building ARM, of course, robotics processors. Our latest one is called Thor. It's used for robots and of course, for autonomous driving. We also have a new ARM product that's called N1. And that product is -- that processor is going to go into the DGX Spark and many other versions of products like that. And so we're super excited about the ARM road map, and this doesn't affect any of that. NVIDIA's architecture accelerated computing covers just about every CPU architecture. And our most important interest is for whatever general purpose computing platform that has market reach, we would like to be able to accelerate it to its fullest capability.
And so today, we have the benefit of partnering with Intel on a CPU platform that unquestionably has the largest enterprise, industrial space, cloud, consumer footprint of any CPU in the world. And so a really exciting partnership, and we're going to revolutionize this general purpose computing platform by adding and fusing the NVIDIA accelerated computing and AI computing architectures.
The next question comes from Stephen Nellis with Reuters News.
I had a question for Jensen and Lip-Bu. Jensen, why did you feel it was appropriate or necessary to also make an equity investment in Intel along with this product collaboration? And Lip-Bu, this is now sort of a string of equity investments we've seen from folks, and we expect other ones from potential partners or foundry customers in the future?
I appreciate that question because we thought it was going to be such an incredible investment. This is a big partnership, and we think it's going to be fantastic for Intel. It's going to be fantastic for us. And we're building revolutionary products that's going to address some $50 billion annual market. And so how could we, on the one hand, be excited about the products and how revolutionary they are. On the other hand, not be excited about the opportunities ahead. And so we're delighted to be a shareholder. We're delighted to have invested in Intel. And the return on that investment is going to be fantastic, both, of course, in our own business, but also in our equity share of Intel. And I think it's going to be fantastic for Intel. It just reflects how excited we are about this partnership.
Thank you, Jensen. I think to answer your question, I think clearly, as I mentioned, my top priority, top 10 priority, one of them is to strengthen my balance sheet and that's -- I'm focused on that. And then secondly, in this particular situation, I think, first of all, I'd like to thank Jensen for the confidence in me. And then our team and Intel will work really hard to make it a good return for you. And more important, it's a strategic partnership to drive the products and go to market together to win. So that, I think, is very meaningful for us.
The next question comes from Robbie Whelan with WSJ.
Congratulations. There have been a lot of questions about whether or not NVIDIA will someday use Intel as its foundry partner for its most advanced AI chips. But with these CPUs that we're talking about under this partnership, will TSMC be fabricating most of those CPUs for the foreseeable future? And then also just really quickly, what's the bigger addressable market under this partnership, data centers versus PCs and edge computing? In other words, do you expect to be making more CPUs for PCs under this new arrangement or more CPUs for data centers?
Lip-Bu, do you want to take the first part?
Sure. So I think in terms of the -- as we mentioned earlier, this is more the product collaboration announcement. We both saw a lot of respect for TSMC, C. C. Wei, Morris Chang and we continue to work with them. In terms of the Intel Foundry, we continue to make progress. And then in terms of the yield performance, 18A, 14A, clearly, we want to qualify and then we're going to decide whether this is the right one for doing our foundry. So I think we continue to improve at the right time, Jensen and I will review that. But overall, I think we're going to continue driving our success on the process side and then win the customer confidence and trust and then one step at a time.
Yes. I think one of the things that I would also add is that Intel has the Foveros multi-technology packaging capability. And it's really enabling here. And the reason for that is because as we all know, NVIDIA's GPU technology is based on TSMC's foundry. And this is one of the extraordinary things that you can do, connecting NVIDIA's GPU die chiplet with Intel's CPUs in a multi-technology packaging capability and multiprocess packaging technology. And so it's really a fabulous way of mixing and matching technology. And that's one of the reasons why we're going to be able to innovate so quickly and build these incredibly complex systems and deliver it as multi-chiplet systems packages. And so I'm really excited about that. In terms of the size -- go ahead.
I think, Jensen, you brought up a good point about our advanced packaging, Foveros and also the EMIB is a really good technology, and we will definitely continue to refine it and make sure that's reliable and the yield improvement. And so I think that part definitely we will explore the collaboration opportunity.
Yes. With respect to the size of the market, the data center market and the PC market are both large. And we're going to build revolutionary products, first-of-its-kind products, nothing of its kind has ever been built before for the x86 market. And so I think the -- if my recollection is correct, the data center CPU market is about $25 billion or so annually. And just the notebook market is 150 million notebooks sold each year. And so that kind of gives you a sense of the scale of the work that we're going to do here.
We're going to address the -- in terms of the consumer market, we're going to address a vast majority of that consumer PC market, consumer PC notebook market. And with respect to the data center, we're going to bring NVLink, which is the scale-up interconnect, the fabric of NVIDIA, the computing fabric of our company. We're going to bring that capability to Intel. And so I think these are going to be revolutionary products, and we're looking -- I know that all of us working on it are super excited about it. The architects working on it are super excited about it. And so we're looking forward to telling you more about it over time.
The next question comes from Edward Ludlow with Bloomberg.
Jensen, I appreciate you talked a lot about the addressable market. Could you explain how NVIDIA participates in the economics of x86 because you make money on ARM-based CPU, right? So if you could explain how it will work for NVIDIA on the top and bottom line, that would be great. And Lip-Bu, congratulations. You've been in Silicon Valley, so to speak, for a long time, right? If you stand on the Intel roof, you can look across the freeway at NVIDIA. Would you just explain the culture and sentiment inside Intel today in reaction to this new partnership and what it means for the trajectory of you and your staff?
In the case of the data center the server CPU, it's like us buying Grace CPU from TSMC, integrating it into our rack scale systems and selling that. It's basically the same idea. So we're now instead of -- for x86, we don't buy any CPUs. We let the market sort it out. And the CPUs are really sold as discrete servers, separate servers that are then connected with our GPUs in the data center. And that architecture, basically using PCI Express retimers and things like that, basically PCI Express retimers and repeaters essentially.
Instead of building servers like that, that really don't have the ability to scale up to NVLink 72 large fabric systems, we now are able to do that with Intel x86 CPUs. And so we'll buy those CPUs from Intel and then we'll connect it into super chips that then becomes our compute nodes that then gets integrated into a rack scale AI supercomputer. In the case of our consumer PC, we will sell -- the current idea is to sell NVIDIA's GPU chiplet either in a pass-through way with Intel or sold to Intel. And that is then packaged into an SoC. And so we buy a server die, server chip on the one hand, I guess, server chip on the one hand. We sell a GPU chiplet on the other hand. And in both cases, it expands the market for Intel very significantly and it expands the market for NVIDIA as well.
And then in terms of your question about the culture changes at Intel, and first of all, this is a new Intel culture I'm trying to build, and it's going to be engineering focused and extending my relationship with Jensen from a Cadence NVIDIA partnership in terms of drive innovation. And now we are so excited to have this partnership to collaborating to the engineering on x86 and also the GPU accelerator and then on the AI side and with NVLink. So I think there's a lot of engineering collaboration together. The culture I want to have is really lean, fast moving and so that we can match up with the Jensen fast-moving culture. So I think that's something that I'm looking forward to build that culture that can match each other to drive the best solution for the market.
[Operator Instructions] Your next question comes from Kristina Partsinevelos with CNBC.
Jensen, you spoke about the Arm relationship, which is great. So it's continuing and maybe the reaction today was a little overdone. But given SoftBank's position across Arm, Intel and now your partnership with Intel, is there just like some type of broader coordination that I'm missing here? And then I just have a follow-up on China.
Yes. This today should have no impact on Arm. And with respect to the second question, not that I'm aware of. There were no communications with anybody else, except for between Lip-Bu, myself and the technical teams that were working on this partnership. And we kept it really quiet. Obviously, it's a very substantial partnership. This is going to expand the market opportunity for Intel in AI infrastructure that is largely unexposed to them today, and it's going to expose to Intel in the consumer notebook market where really exquisite GPUs are necessary. And so these 2 markets are unexposed to Intel today, and it's going to be brand-new growth markets for Intel.
And so I think -- and all the due diligence that we've -- between our 2 teams and all the work that we did gave us a lot of confidence about the future of Intel. And so we're really betting on -- well, I've become quite a significant shareholder because we believe in this, and we have confidence in them to create -- to partner with us to create these amazing products. But all of these discussions were -- had no relationship to any of the things that you were talking about.
And just a quick follow-up. Intel definitely, we know faces different regulatory constraints than you do. And all the stories that are coming out of China is just constantly on CNBC. We're talking about all the time. But is that part of the calculus here that Intel's difference in regulatory constraints would help you in the medium to long term?
I don't think there's any relationship there either. And I don't think there'll be any impact either way.
The next question comes from [indiscernible] with [indiscernible] Information.
2. Question Answer
My question is for Jensen. I was curious what types of NVIDIA customers are interested in the x86 architecture for the CPU? And do you expect any customers that currently use the ARM CPU to switch to x86 in the future?
ARM in the world's CSPs is growing. But the vast majority of the world's CSPs are still x86. The vast majority of cloud instances for enterprise users are still x86. And so I think the x86 footprint is still quite large. And NVIDIA addresses it in 1 of 2 ways. In the case of ARM, we could scale up to a full rack scale NVLink system. In the case of x86, we address it through external PCI Express retimers, and we scale up to NVLink 8. And so in the case of x86, we scale up to NVLink 8. In the case of ARM, we scale up to NVLink 72. And so now we could -- with x86 scale up also to NVLink 72. And so I think this is a really great growth opportunity for both of us. And it also creates a product that for many customers who are still x86 based and basically, the vast majority of the world's enterprise is still x86 based, they now have state-of-the-art AI infrastructure.
The next question comes from Eva Dou with Washington Post.
Since President Trump, manufacturing in the United States is such a big push for him and both your companies have committed to multibillion-dollar investments in this area. Could you talk a bit about what are realistically the prospects of manufacturing your chips in America? Like what proportion of your chips do you expect to be made in the U.S. in the near future? And what are sort of the challenges to doing more of the production here?
Lip-Bu, would you like to go first?
Sure. Clearly, we -- and clearly, we like Trump -- President Trump focus on manufacturing in U.S. But I think it's important to address that and then the opportunity we have in front of us. But meanwhile, we also have the footprint for Intel globally. And so in the way we just meet customer requirement, include the NVIDIA and then so that they have the flexibility, which best suitable for them. And then meanwhile, we continue to improve our yield performance. And also the other part is the advanced packaging we just talked about. I think it's a great opportunity for both of us.
This concludes the question-and-answer session. I'll turn the call to Mylene for closing remarks.
I want to thank all of you for joining us today in this historic day. It's a historic partnership. I want to thank Lip-Bu for his leadership and the management team of Intel that we've had the great privilege of working with architecting two really exciting product lines and architecting this partnership. We're going to go and address a new computing era where accelerated computing and AI are essential to every aspect of computing, whether it's in the data center and the cloud or in mobile devices and personal computers. I'm super excited to start the projects and our partnership. I want to thank Lip-Bu again and the Intel team for this exciting announcement and this exciting new partnership. Lip-Bu?
Yes. Thank you. First of all, I want to thank Jensen and NVIDIA for the trust and support of Intel, and we will work hard to make sure that this will be a great success for both of us. I think more exciting for me is the collaboration, the best of the acceleration AI and also the x86 and then using the NVLink to scale. And I think this is a new compute platform that we are moving forward. And I'm super excited about the opportunity in front of us and a lot of execution, we're going to be doing that. And then stay tuned. We're going to update you when the time come. But I just want to thank all of you for attending this announcement together.
Thank you, Lip-Bu.
Thank you.
Thanks, everybody.
NVIDIA — Special Call - NVIDIA Corporation
📣 Key Message
- Key message: NVIDIA and Intel announce a strategic partnership to jointly develop custom x86 CPUs for data centers and integrated Intel x86 system-on-chips (SoCs) with NVIDIA GPU chiplets connected via NVLink (a high-performance interconnect technology); NVIDIA is also taking a minority stake in Intel.
🎯 Strategic Highlights
- Server: Development of custom Intel x86 CPUs to be integrated into the NVLink ecosystem in order to enable rack-scale AI supercomputers (NVLink 72 architecture).
- PC SoC: Joint SoCs for notebooks and personal computing that are meant to fuse Intel CPUs and NVIDIA GPU chiplets via NVLink and advanced packaging; the goal is a new class of integrated high-performance laptops.
- Go-to-market: NVIDIA becomes a major buyer of Intel server CPUs and a supplier of GPU chiplets; both companies see an addressable market in the tens of billions of USD annually (according to Intel and NVIDIA).
🔭 New Information
- New: A formalized product collaboration (x86 server CPUs plus integrated PC SoCs) and an NVIDIA equity stake in Intel. No concrete date for volume production was given; the teams have been working together for roughly a year, and timelines remain vague.
❓ Analyst Questions
- Foundry/production: Still open. TSMC remains an important partner; use of Intel Foundry is being evaluated and is not currently decided (yield and process maturity are the deciding factors).
- Timelines: Architectures are being developed internally, but concrete launch dates are missing; analysts pressed for expected time-to-market information.
- Finances & economics: How NVIDIA participates economically in x86 products (buying CPUs vs. selling GPU chiplets) was outlined, but revenue and margin effects remain unquantified; the equity investment serves as an alignment signal.
⚡ Bottom Line
- Significance: A strategically highly relevant partnership that ties NVIDIA significantly into the x86 ecosystem and gives Intel access to GPU-based AI growth. In the short term there are no guidance changes or clear revenue figures; in the medium term there is large upside, but execution, foundry and timing risks remain decisive.
NVIDIA — Goldman Sachs Communacopia + Technology Conference 2025
1. Question Answer
Okay. Good afternoon, everybody. Thanks for being here. Welcome to the Goldman Sachs Communacopia + Technology Conference. My name is Jim Schneider. I'm the semiconductor analyst here at Goldman Sachs. It's my pleasure, and I'm sincerely honored, to welcome NVIDIA and CFO Colette Kress to the stage today. Welcome, Colette.
Thank you. Happy to be here.
So maybe to start off on a few things from what we heard from you a couple of weeks ago when you reported on your latest earnings call, you stated that the data center infrastructure capital requirements could reach $3 trillion to $4 trillion by the end of the decade. I think many investors were doing a sound check at that point. Maybe give us a little bit of context for that statement and sort of impact or share some of the major building blocks to that?
Okay. So first, let me remind everybody, we might be making forward-looking statements, and I kindly remind you to look on our website for our disclosures as well as our 10-Q and other types of reporting to help you.
Let me talk about what has changed since the last time we were here. The last time we were here was a year ago, and congratulations on your new role in helping us here. At that time, a lot of folks came in here to discuss our Blackwell architecture, really interested to hear about that transition and whether or not there would be an air pocket in between. So we have obviously been able to safely bring Blackwell into the market, not only our current Blackwell and our GB200 systems, but we also now have our Ultra system and our Blackwell Ultra systems. So that is one piece that has changed quite a bit during that period of time.
Additionally, as well, I think at this time, we often talked about -- is there going to be a future need for compute, have we maxed out in terms of the need to compute. A lot of discussion in terms of pretraining post training and inferencing is what has been necessary. As you can see over this last 12 months, a tremendous need still out there in terms of compute. And one of the most important pieces that we're seeing today that is increasing more and more of the compute, is those reasoning models. And we'll talk a little bit more about those as we go forward.
And then lastly, a lot of discussion in terms of why the 1-year cadence, why do you see? What will it help? Can that be something that you can execute? Our 1-year cadence is going quite well, so tremendously important to us, and in terms of our customers that we can keep that innovation and advance as fast as possible, which has just allowed us multiple, multiple different architectures in place.
Most importantly, we still have our Vera Rubin getting ready. We have indicated not only that our Rubin chip is available. But remember, there are six chips in our Vera Rubin that will be coming to market. And those are doing just fine and have taped out. And now it is our time to mature those before they again go into market.
Well, let's step back to the question regarding the $3 trillion by the end of the decade. What does that mean? What are we thinking? It was really a purpose to help the full ecosystem understand how important this market and this transformation is, we are really talking about a new computing platform for the next decades going forward. This isn't just about an AI solution. We really need to transform from something that's been here for more than 20 to 30 years as existing of a standard computing platform. When you think about accelerated computing and AI, it is that large transformation.
If you recall from our GTC, we had talked about how you will likely see, in a couple of years going forward, that $1 trillion mark in terms of capital needed to fuel the data center infrastructure that is going to be built. We're on track to do that. Even our CSPs today have literally doubled the amount of capital that they are spending from what they had just two years ago. But they are only one part. You can look at those four top CSPs. You can look at the Mag 7, you can look at the AI labs that are being built [Audio Gap] So we can do from a computing perspective, yes, we're right there and right there behind that.
But there's other pieces in terms of the power, getting the data center ready. Most data centers are usually thought through and discussed more than three years out and are beginning that work. So keeping the ecosystem understanding of what we see and the likelihood was a very important exercise.
Very good. That's helpful context. Maybe following on to that, when you reported this last quarter, data center had really good growth in Q2, and you guided to strong growth again in Q3 even without the contribution from China in the numbers.
And there were some moving pieces in there between networking, compute and other different products. So maybe help us understand what's driving that demand and the different components that are kind of contributing to that strong growth outlook?
Yes. So we're really talking about our data center revenue as a whole, stripping out the H20 out of our Q1 results, Q2 results and really looking at the data center, inclusive of the compute as well as the networking. In Q2, that was a sequential growth of 12%. And what we're targeting right now for Q3 in our outlook is a 17% growth sequentially. So you're already starting to see a surge up in terms of the demand that we see forward and getting ready for it.
In our Q2, there were a lot of different parts to it. Not only have we continued what we were doing with our Blackwell GB200 and our B200, but we were also at scale with our GB300 Ultra. There was a lot of discussion that said, "I didn't know that would actually be a big part." It was seamless. It was a seamless transition, and many people didn't understand the amount of scale and volume that we were actually able to put into market as well. So both of those are moving quite well, and you'll see more of that in Q3, still shipping both our GB200 and GB300.
So keep in mind, our networking is, for many of our systems, those Grace Blackwell systems as a whole, includes our NVLink, which we will again continue to talk about further on how important that new transition was. But NVLink is also incorporated in our networking number. So that is usually running side by side in terms of what we can see into the compute. But there's also additional important areas of where we had focused.
We knew that Ethernet for enterprises was very important, but we built the enterprise focus on Ethernet for AI in what we do. So that is also doing quite well, has a quite good attach rate in terms of a lot of the systems that we're doing, really focusing on everything at the data center scale and that completion. And it was very successful and grew not only quarter-over-quarter, but year-over-year.
Additionally, InfiniBand. It is the gold standard and continues to be the gold standard. It focuses on a lot of the supercomputing, and we have a new offering, and that saw tremendous sequential growth, nearly doubling sequentially. So a lot of great things within our Q2 and more to come as we look toward Q3.
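As a rough check on the sequential growth figures cited in this answer (12% for Q2 and a 17% target for Q3, both excluding the H20), the following minimal Python sketch compounds the two rates. The function name `compound_growth` is a hypothetical helper introduced here for illustration, and the result is simple arithmetic on the quoted percentages, not a figure reported by NVIDIA.

```python
def compound_growth(rates):
    """Return the cumulative growth implied by a list of period-over-period rates."""
    total = 1.0
    for rate in rates:
        total *= 1.0 + rate  # each period grows on top of the previous one
    return total - 1.0


# Rates quoted in the transcript (assumed to apply to the same revenue base).
q2_sequential = 0.12  # Q2 sequential data center growth, excluding H20
q3_sequential = 0.17  # Q3 outlook for sequential growth

cumulative = compound_growth([q2_sequential, q3_sequential])
print(f"Implied two-quarter growth: {cumulative:.1%}")
```

Under these assumptions the two quarters compound to roughly 31% growth, slightly more than the 29% obtained by simply adding the two rates.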
Very good. Now the H20 is your product for the China market. Maybe talk to us about any update you can provide on demand for the H20? What needs to happen for you to ship that product in Q3 to China? And maybe just talk about the broader confidence in your China business overall?
Yes, we did receive a license approval and have received licenses for several of our key customers in China. And we do want that opportunity to complete that and actually ship the H20 architecture to them. Right now, there is still a bit of a geopolitical situation that we need to work through between the two governments. Our customers in China do want to make sure that the Chinese government is also receptive to them receiving the H20. But we do believe there is a strong possibility that this will occur.
And so it could add additional revenue. It's still hard to determine how much within the quarter. We talked about it being about a $2 billion to $5 billion potential opportunity if we can get through that geopolitical statement.
Very good. Let's touch on the prospect of competition or potential competition for a moment, if we could. There's been an ongoing debate, as you know, in some of your meetings about the role of ASICs for both the training and for the inference markets. How do you see your competitive position evolving for both these kind of workloads?
So focusing on inferencing and training, this -- it's been interesting to watch the dynamics that says we understand your training performance for so many years in building large language models or recommender engines has been important. But the inferencing is very essential as well. And the two are not necessarily a separate type of workload. As we move forward and see the work that's being done even on the reasoning models, you will likely be continuing to do a lot of the post training along with that reasoning model to assure you can get the right response and we will likely be working with multiple different types of models.
So where we had created a data center scale system, was to focus on what we knew was such an important industry regarding the inferencing, much larger than anything we'll see in the future. And a lot of that fuel of growth on inferencing has been fueled by more individuals just using AI solutions, but also more and more token generation types of inferencing required in terms of the reasoning models. Now why is that reasoning model such an important piece, is because if it can reason, if it can get to a high level of performance on reasoning, it can do work for us.
And that work is really speaking about the agentic AI that is going to be in front of us. And so we are looking at our position of creating a data center scale that can be the most performant, but the most performant per watt and the most performant per dollars as well. That wattage is such an important thing.
Right now, you can decide whether or not capital or power is more important. In respect to our -- they both are tremendously important. But when you are purchasing any type of large system as we are, you have to keep in mind, you will be using power throughout the journey of owning that full cluster, 4, 6 years or even further.
So having that high performance is going to be very important to make sure you are properly addressing that power that is going to be needed for that. So we stand very strong in terms of how we thought about that transition and moving to a full data center scale solution.
Some of them focus that in terms of it's a rack scale type of capability, where we put all of the different chips together so that they would be working together, be optimized together in terms of the right performance. So we feel very good with our plan and that transition, but a very big transition for us.
Okay. Very good. Now as AI becomes more mainstream, [indiscernible] the reason, I think that some workloads might be a little bit less compute-intensive going forward and could run on multiple versions of NVIDIA architectures. How do you think about your market share in those kind of simpler AI workloads or things that might be tied to smaller models over time?
So your compute that you'll put together at any type of enterprise is probably a full AI factory where all of your data and all of your pieces are together. What that enables you to do is to continue with all different type of traffic, all different types of requests, needs to all be contained within the same, but you are more efficient with that all together.
So we believe these AI factories will continue to grow and be a significant piece of how enterprises are thinking about their data and those things pulling together. I don't think it is about, hey, I've got a smaller model. They will all be trying to pull all of their data together. And it's not just going to be about a small amount of data. They will continue to probably connect to many different types of systems going forward as well.
Okay. Interesting. So you feel like you can sort of maintain that sort of share leadership across the different levels of the model?
It will be -- what you want to do is use your best resources to manage that full data center and keeping the strongest performance and putting all those things together to happen collectively is going to be the right response.
Yes. Very good. Now NVIDIA's platform I think it's pretty clear or almost indisputable at this point that NVIDIA's platform continues to lead the market in terms of performance. And also you have an annual product cadence as you mentioned earlier. Can you maybe talk about some of the sort of economic benefits that annual product cadence brings to your customers?
The economic benefit that we are already seeing from that 1-year cadence is speed, in terms of how AI is evolving every single time we may be here or at any different point in time. There is more and new connectivity, more and new types of models and pieces that need to be put together. By continuing to advance and innovate at a fast speed, we keep folks really focused on not worrying about which version they're on, because you're going to still be in line with that cadence of increasing each and every time.
What we find is that being assured that your power is being utilized effectively, whether it is on the most current version or still on the last version, which again is still tremendously performant, has been helpful to them in continuing to have different modes of getting ready for it.
We find our newest architecture advances to some of the larger and bigger models, but many of them, for example, from the GB300 was a statement where GB versions were being used initially right off the gate with inferencing as they saw the 30x improvement from just where we were with Hopper in our Grace Blackwell versions of that. That is a tremendous performance improvement, but that also enables Hopper to continue also doing significant work that they need in preparing for that inferencing stage of the things. So keeping the Hoppers also moving in terms of the Grace Blackwell has worked quite well.
And then just in terms of a product perspective with that annual cadence, it seems to me at least that Blackwell was a pretty big performance increase versus Hopper. Going forward, how should we think about the performance increase offered by Rubin relative to Blackwell's?
Okay. That's correct. Rubin is on a path and that 1-year cadence is going to be a journey that we're ready to take on with Rubin. So Vera Rubin, six chips, all of them taped out and now maturing. This will be another point where we can continue to advance some of the most important pieces. As you heard on our earnings call, we announced our focus in terms of scale out, scale up and, importantly, scale across. So that's a new thing to think about in addition to what we are building, which will be a great advancement going forward.
One of the most important pieces that we created when we transformed our Grace Blackwell to a data center scale system is our NVLink. Our NVLink at this point is a fifth-generation NVLink and an important piece that enabled not just 8 GPUs together, but now you can do a full rack scale, and it is currently at 72 GPUs. That was a huge efficiency. That's what enabled such a fast and performant move as we went to Blackwell, and those two things together, the scale-out perspective and NVLink, are truly important to our work.
Very good. So that's -- so you would contextualize in terms of scale out and scale across being like one of the big drivers of what Rubin is going to provide?
Right. We'll see. We'll see when it gets there.
Okay, very good. So you also reported very strong growth in networking this quarter, probably much stronger than most investors, including myself, had expected. What is driving that? And why was the networking growth so strong. Is that something where -- is it a forward indicator of what's to come in on the compute side of things? And do you think about the networking build as sort of prebuilding to sort of fill and compute later on? Or how do you contextualize that for investors?
The way you want to think about networking. Networking is a timing of when it arrives. It's very well connected right now in terms of how we designed our compute infrastructure. And as we indicated, NVLink is incorporated, both in networking, but it's a very important part of those systems. So the growth rate that you see in terms of computing and network should long term be approximately a continuous growth rate or said differently, total data center revenue has that continuous growth.
In some cases, though, there's a timing element in terms of when the networking is received. A lot of times your networking arrives before you actually ship the compute, if they need to wallpaper the entire data center with that networking. So those are just some of the short-term timing effects. But in the big picture, compute and networking are very important and are both growing quite consistently.
You mentioned NVLink. That's been, I think, a key competitive moat for NVIDIA for some time now relative to other competitive technologies like PCI Express and others. How do you view the opportunity for NVIDIA in terms of opening this technology up to competitors with NVLink Fusion? And how do you think about both the opportunity side and whether there's any risk on the downside for you?
Yes. We're still operating both NVLink and PCI Express. So keep in mind many of our enterprises. As we began with our Blackwell architecture, many of those were liquid-cooled types of systems, very energy efficient, and not everybody was ready to build that out.
So we still have an enterprise PCIe version, which allows many of those different enterprises and certain industries to use that, and we'll continue with the PCIe version. But yes, the scaling of NVLink has been such a huge value to so many of the AI lab builders as well as the tremendous amount of token generations that they need.
So we came into NVLink Fusion and the possibility to say, how do you continue to maintain a focus in terms of our platform, the benefits and also adding many other different characteristics in the data center within there. Maybe they could bolt on line with what we have in terms of our data center. We'd be happy to allow them to be a part of our full infrastructure, and that's what NVLink Fusion is enabling.
Now NVLink Fusion, a lot of interest in terms of many different other chips that could be added to that. And I think we're going to see more to talk about in the future about how that's doing.
Okay. Very good. Now I think it's fair to say that if you talk to anybody in the supply chain, they're very impressed by your ability to scale supply for your products over the past couple of years or so. How have you accomplished this? And what lessons have you learned? And then how do you ensure your supply chain can actually keep up with NVIDIA's rate of progress?
Okay. The supply chain has been such an important part of our success. We -- as you know, with our GTC Taiwan work that we did, we actually really wanted to see many of them because they have spent so much time over the last 30 years, both understanding, but also have been truly inspirational in terms of how they feel they could raise the overall supply for us.
Supply is not about just requesting supply. Many of them also have to think about additional capacity and what we mean by additional factory, complete different manufacturing line. And then we also have to think about the resiliency and the redundancy that we can use multiple different types of suppliers to build the size that we need in terms of compute, both now and as we move forward.
So those folks have been working with us and staying ahead of both understanding where our architecture is going so they can build and work there. We believe that partnership has been one of our largest success factors that I don't think many other companies could have built that supply chain. It's a question of who calls who first in the morning, but the suppliers are all here to help us. Each and every way around there. It's not just about ordering. It really is about just scaling the entire operation is what we're working on.
Yes. Makes sense. And then you talked about Rubin and the performance advantages it provides. Maybe just give us a quick status check on how Rubin is progressing and then downstream how NVIDIA is sort of preparing or helping data centers get ready for your latest technologies and kind of preparing for acceptance in the ultimate end user environment?
Yes. We've been very open about our architectures and what we believe is our cadence and which is happening. The closer we get to it, we provide more and more information, and we've been sharing and getting a great understanding of the customers. What does that enable? That enables the right type of planning.
We know that to stand up these data centers, from the very, very beginning to the end, you've got about three years. They need to understand what's available and when it will be available. So right now, our Vera Rubin and its chips are maturing. We have already had discussions where we probably will see several gigawatts of need for our Vera Rubin. So we've already likely seen that and penciled that in.
So even way before it's even ready to go to market. We already are seeing gigawatts worth of needs as we go forward. So we feel this is a great 1-year kind of cadence program because it does help them run efficiently in terms of further data centers.
Okay. I think Jen-Hsun in the past has mentioned that the useful life of NVIDIA GPUs is quite long. How do you think about the useful life of the sort of the chips that were deployed very early in 2023? And when should we expect or might expect a replacement cycle for that generation of products to start kicking in?
Yes. We're in '25 and they still have some things for '23. What we see most is there's one place of it which is a depreciable life. The depreciable life is one piece of it. It just means what you have chosen from an accounting type of life. Many of them are probably at about 4 to 6 years. Many of them will continue keeping it in their data center because it is high performance still. Sure, the next generation is more. But if you are going to value your power, you do have a key use case for that in terms of moving.
If you are going to remove any version, you do want to replace that with an equal in terms of that type of performance. They're getting a lot of benefit through that full period, and the residual life actually is quite reasonable in the things that they have. So we are not yet actually seeing a lot of changes in there. We are seeing Hoppers up and running quite well. And it will be a question later, based on the size of it, of whether or not they want to change that out for a new power option for them.
Okay. Very good. A couple of financial questions real quick, if I could. I'd be remiss if I didn't ask you. You reiterated a gross margin target of getting back to the mid-70% range by the end of this year. What's driving this? And how should we think about your gross margins more generally as you navigate through different product transitions in the future?
Okay. So we did take an action. We knew going into a new type of build, as we did with our Blackwell architecture at full scale, that we had to work through building out full data center scale. And we now have that running quite smoothly, and quite smoothly with GB300 and the Ultra version.
That allows us now to continue to focus our work in terms of cycle time, the amount of time to market, less time, lower cost and also in terms of a full mix of all the different offerings that we have can also improve our gross margins. We're on track to that. You saw us increase in terms of Q2 and in terms of our outlook for Q3. So we're right in line of moving to Q4 to that case.
Now in terms of future, as you remember, our whole focus in terms of pricing and how we think about it is the total TCO. How do we want to make sure we can provide customers the best TCO experience of anything else they could ever consider and that will come into a factor. And then we're already back on those types of systems, and we'll incorporate a gross margin at that time.
Yes. Okay. And then I think it's fair from a capital allocation perspective to conclude, you generate a lot of cash. Talk about -- thoughts about your capital allocation strategy. But specifically, is large-scale M&A a possibility for NVIDIA at this point in time or not?
So let's first start on what is our most important focus on when we think about that capital, we think about that cash? Leveraging that in the most strategic way of our ecosystem is where we want to go, helping those early in their work that they're doing, how can we infuse capital to actually help in various forms of investments. Yes, we do have M&A folks that engineering may be bought here that actually helps us, both whether that be from a software perspective or new techniques that we want to build into our infrastructure.
I'm never going to say that M&A is not a possibility. The size of them just depends. It's hard to think about whether that is always going to be a perfect match for us. Is there that perfect company out there of a very large size? We were quite fortunate with Mellanox, which is probably very nearly the best acquisition on the planet that ever happened.
And sure, we'd love to have a twin or something like that, but it's difficult. So we do focus on our cash being used for the most strategic parts of the ecosystem that we can and using that. That doesn't mean that we will not do a focus on repurchasing our stock. That has been our case possibly just to offset dilution, maybe a little bit more from that. And we always keep our dividend as well.
Very good. I'm going to close out on just two high-level go-forward questions, if I could. One is just if you think about NVIDIA's leadership, Jen-Hsun, yourself, other members of the leadership team, what are your top two to three priorities for the next, say, two years or over that time frame, both for the company or potentially externally?
Surely, if we think about what's running every single day at NVIDIA is getting the next architecture out and moving clearly at that cadence. The next piece of it is not a surprise either. We are going to work on the next cadence after that and focusing on what do we believe is coming around the corner so that we can make sure our infrastructures in every single cadence is right leading at the edge of where the next AI is going. So those are our top pieces.
I think we fall into a situation of an agile company that is more agile than probably any large company for sure, but also any type of small company that we can do and move at that speed has been one of the greatest benefits that we had from a [ global ] leadership and a company culture as a whole.
Very good. And then maybe I think it's fair to say that you've seen your investor conversations and I hear it every day from investors. But obviously, there's a lot going on in the AI space today, a lot of market change, a lot of technology change, and a lot of competitive change. So if you were to serve, say, people get buffeted by these data points almost every single day. So if you were to say here's the top 2, 3, 4 things to really focus on for investors to think about regarding the evolution of AI, what would they be?
Yes. It's interesting that even in a great quarter, with a great next quarter ahead, there is always this question that says, okay, well, what's next, what's out in the future. And it's a great question to ask because we're on a journey, and maybe we're at the first inning or second inning of this journey going forward, because the world needs to transition, as we've discussed, moving not just to AI solutions, but to a different form of computing that is accelerated and parallel.
So you can look and say, but I can see the four cloud providers and the work that they're doing. They are doing a tremendous job of being helpful in the early stages, as a way to use the cloud and get started with AI, but there is so much more that needs to happen. But if you recall, at the very onset of AI, we were focusing on perception.
We were looking at and sorting different items into categories and saying, look what we've done, we've been able to categorize. The next phase was focusing on the recommender engines, and along came generative AI. And everyone said, this is amazing, this is great. But again, it is an important capability that is still absolutely advancing, particularly as we talk about that reasoning and about what will move to agentic. You want it to actually get work done. Now don't get me wrong, spending the evening by yourself with your lovely model talking to you all evening and answering all your questions is a great thing.
But it would be even more impressive when we show up for work the next day, in terms of the amount of work that will be able to be accomplished. And it's an important piece because there are so many industries right now that struggle in terms of how am I even going to get the workforce to do all of the work that is needed.
The more that AI can be used in there, the more efficient all things will be. So we have a long way with enterprises to industry by industry for them to transform. You're not going to see that AI take place in a situation that there's a new AI tool system, you already have a full SaaS system of tremendous amount of software opportunities that will be infused with AI solutions to get us there.
So as we see that journey every day, it is another case of new and exciting pieces. We love being in the highlights and seeing what is moving and what they're building on. But let's just say the next models, they're multimodal. Our models right now. They have a tremendous amount of information. They've got a tremendous amount of data. They've got a tremendous amount of numbers and words. But now how do I read PDF? How do I include in terms of video? How do I make them specific for physics or otherwise. There's great more tremendous amount of work as we go forward. And where we focus on is the full journey. We enjoy being that key platform to enable what we'll see in the future, but that's what I think our work is in front of us.
Very good. Well, I think it's fair to say if this is inning 1 or 2, we've got a heck of a ball game ahead of us. Thanks for being here, Colette. We appreciate it.
NVIDIA — Goldman Sachs Communacopia + Technology Conference 2025
📣 Key Message
- Market forecast: NVIDIA outlines long-term data center investment of about $3 trillion to $4 trillion by the end of the decade and sees this as a platform shift toward accelerated (parallel) computing.
- Product cadence: The annual product cadence (one-year rhythm) and the new Blackwell/GB300 families drive continuous gains in performance and efficiency.
🎯 Strategic Highlights
- Demand: Data center grew ~12% sequentially in Q2; management targets a sequential increase of ~17% for Q3 (excluding any China effect).
- Networking & NVLink: Strong growth in Ethernet for AI and InfiniBand; fifth-generation NVLink enables rack scale (up to ~72 GPUs), and NVLink Fusion opens the platform to third-party chips.
- China / H20: Licenses confirmed for several Chinese customers, but shipments still depend on geopolitical clarity; estimated upside of ~$2 billion to $5 billion possible.
🔭 New Information
- Product status: GB200/GB300 Ultra are on the market in high volume; Vera Rubin: six chips have taped out and are maturing; management already expects gigawatt-scale demand.
- Margin target: A return to gross margins in the mid-70% range by year-end remains the goal; drivers are mix, shorter cycle times and economies of scale.
❓ Analyst Questions
- Competition: Discussion of ASICs versus GPUs; management emphasizes performance, per-watt and per-dollar efficiency for training and inference, with differentiation through NVLink/rack scale.
- Model diversity: Demand for smaller models vs. "AI factories": NVIDIA expects integrated infrastructure rather than separate small installations.
- Open points: Timing and volume of H20 deliveries to China, concrete Rubin performance figures and the scope of large M&A remained vague.
⚡ Bottom Line
- Implication: Signal to shareholders: strong, broad-based demand, a clear product roadmap and structural moats (NVLink, networking). Key upside factors are China shipments and the Rubin ramp; risks remain geopolitical uncertainty and execution in gigawatt-scale build-outs.
NVIDIA — Q2 2026 Earnings Call
1. Management Discussion
Good afternoon. My name is Sarah, and I will be your conference operator today. At this time, I would like to welcome everyone to NVIDIA's Second Quarter Fiscal 2026 Financial Results Conference Call. [Operator Instructions] Toshiya Hari, you may begin your conference.
Thank you. Good afternoon, everyone, and welcome to NVIDIA's conference call for the second quarter of fiscal 2026. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer; and Colette Kress, Executive Vice President and Chief Financial Officer.
I'd like to remind you that our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the third quarter of fiscal 2026. The content of today's call is NVIDIA's property. It can't be reproduced or transcribed without our prior written consent.
During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties, and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent Forms 10-K and 10-Q and the reports that we may file on Form 8-K with the Securities and Exchange Commission. All our statements are made as of today, August 27, 2025, based on information currently available to us.
Except as required by law, we assume no obligation to update any such statements. During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website. With that, let me turn the call over to Colette.
Thank you, Toshiya. We delivered another record quarter while navigating what continues to be a dynamic external environment. Total revenue was $46.7 billion, exceeding our outlook as we grew sequentially across all market platforms. Data center revenue grew 56% year-over-year. Data center revenue also grew sequentially despite the $4 billion decline in H20 revenue. NVIDIA's Blackwell platform reached record levels, growing sequentially by 17%. We began production shipments of GB300 in Q2. Our full stack AI solutions for cloud service providers, neo clouds, enterprises and sovereigns are all contributing to our growth.
We are at the beginning of an industrial revolution that will transform every industry. We see $3 trillion to $4 trillion in AI infrastructure spend by the end of the decade. The scale and scope of these build-outs present significant long-term growth opportunities for NVIDIA. The GB200 NVL system is seeing widespread adoption with deployments at CSPs and consumer Internet companies. Lighthouse model builders, including OpenAI, Meta and Mistral, are using the GB200 NVL72 at data center scale for both training next-generation models and serving inference models in production.
The new Blackwell Ultra platform has also had a strong quarter, generating tens of billions in revenue. The transition to the GB300 has been seamless for major cloud service providers due to its shared architecture, software and physical footprint with the GB200, enabling them to build and deploy GB300 racks with ease. The transition to the new GB300 rack-based architecture has been seamless. Factory builds in late July and early August were successfully converted to support the GB300 ramp and today, full production is underway. The current run rate is back at full speed, producing approximately 1,000 racks per week. This output is expected to accelerate even further throughout the third quarter as additional capacity comes online.
We expect widespread market availability in the second half of the year as CoreWeave prepares to bring their GB300 instance to market, as they are already seeing 10x more inference performance on reasoning models compared to H100. Compared to the previous Hopper generation, GB300 NVL72 AI factories promise a 10x improvement in token-per-watt energy efficiency, which translates to revenues as data centers are power limited.
The chips of the Rubin platform are in fab: the Vera CPU, Rubin GPU, CX9 SuperNIC, NVLink 144 scale-up switch, Spectrum-X scale-out and scale-across switch, and the silicon photonics processor. Rubin remains on schedule for volume production next year. Rubin will be our third-generation NVLink rack scale AI supercomputer with a mature and full-scale supply chain. This keeps us on track with our pace of an annual product cadence and continuous innovation across compute, networking, systems and software.
In late July, the U.S. government began reviewing licenses for sales of H20 to China customers. While a select number of our China-based customers have received licenses over the past few weeks, we have not shipped any H20 based on those licenses. USG officials have expressed an expectation that the USG will receive 15% of the revenue generated from licensed H20 sales. But to date, the USG has not published a regulation codifying such a requirement. We have not included H20 in our Q3 outlook as we continue to work through geopolitical issues. If geopolitical issues subside, we should ship $2 billion to $5 billion in H20 revenue in Q3. And if we had more orders, we can build more. We continue to advocate for the U.S. government to approve Blackwell for China. Our products are designed and sold for beneficial commercial use, and every licensed sale we make will benefit the U.S. economy and U.S. leadership.
In highly competitive markets, we want to win the support of every developer. America's AI technology stack can be the world's standard if we race and compete globally. Notable in the quarter was an increase in H100 and H200 shipments. We also sold approximately $650 million of H20 in Q2 to an unrestricted customer outside of China. The sequential increase in Hopper demand indicates the breadth of data center workloads that run on accelerated computing and the power of CUDA libraries and full stack optimizations, which continuously enhance the performance and economic value of our platform.
As we continue to deliver both Hopper and Blackwell GPUs, we are focusing on meeting the soaring global demand. This growth is fueled by capital expenditures from the cloud to enterprises, which are on track to invest $600 billion in data center infrastructure and compute this calendar year alone, nearly doubling in 2 years. We expect annual AI infrastructure investments to continue growing, driven by several factors: reasoning agentic AI requiring orders of magnitude more training and inference compute, global build-outs for sovereign AI, enterprise AI adoption, and the arrival of physical AI and robotics.
Blackwell has set the benchmark as it is the new standard for AI inference performance. The market for AI inference is expanding rapidly with reasoning and agentic AI gaining traction across industries. Blackwell's rack scale NVLink and CUDA full stack architecture addresses this by redefining the economics of inference. New NVFP4 4-bit precision and NVLink 72 on the GB300 platform deliver a 50x increase in energy efficiency per token compared to Hopper, enabling companies to monetize their compute at unprecedented scale. For instance, a $3 million investment in GB200 infrastructure can generate $30 million in token revenue, a 10x return.
NVIDIA software innovation, combined with the strength of our developer ecosystem has already improved Blackwell's performance by more than 2x since its launch. Advances in CUDA, TensorRT-LLM and Dynamo are unlocking maximum efficiency. CUDA library contributions from the open source community, along with NVIDIA's open libraries and frameworks are now integrated into millions of workflows. This powerful flywheel of collaborative innovation between NVIDIA and global community contribution strengthens NVIDIA's performance leadership.
NVIDIA is a top contributor to open AI models, data and software. Blackwell has introduced a groundbreaking numerical approach to large language model pretraining. Using NVFP4, computations on the GB300 can now achieve 7x faster training than the H100, which uses FP8. This innovation delivers the accuracy of 16-bit precision with the speed and efficiency of 4-bit, setting a new standard for AI factory efficiency and scalability. The AI industry is quickly adopting this revolutionary technology, with major players such as AWS, Google Cloud, Microsoft Azure and OpenAI as well as Cohere, Mistral, Kimi, Perplexity, Reflection and Runway already embracing it.
NVIDIA's performance leadership was further validated in the latest MLPerf training benchmarks, where the GB200 delivered a clean sweep. Be on the lookout for the upcoming MLPerf inference results in September, which will include benchmarks based on the Blackwell Ultra. NVIDIA RTX Pro servers are in full production for the world's system makers. These are air-cooled, PCIe-based systems that integrate seamlessly into standard IT environments and run traditional enterprise IT applications as well as the most advanced agentic and physical AI applications. Nearly 90 companies, including many global leaders, are already adopting RTX Pro servers. Hitachi uses them for real-time simulation and digital twins, Lilly for drug discovery, Hyundai for factory design and AV validation and Disney for immersive storytelling. As enterprises modernize data centers, RTX Pro servers are poised to become a multibillion dollar product line.
Sovereign AI is on the rise as the nation's ability to develop its own AI using domestic infrastructure, data and talent presents a significant opportunity for NVIDIA. NVIDIA is at the forefront of landmark initiatives across the U.K. and Europe. The European Union plans to invest EUR 20 billion to establish 20 AI factories across France, Germany, Italy and Spain, including 5 gigafactories, to increase its AI compute infrastructure tenfold.
In the U.K., the AI supercomputer powered by NVIDIA was unveiled as the country's most powerful AI system, delivering [indiscernible] of AI performance to accelerate breakthroughs in fields of drug discovery and climate modeling. We are on track to achieve over $20 billion in Sovereign AI revenue this year, more than double that of last year.
Networking delivered record revenue of $7.3 billion, as the escalating demands of AI compute clusters necessitate high-efficiency, low-latency networking. This represents a 46% sequential and 98% year-on-year increase with strong demand across Spectrum-X Ethernet, InfiniBand and NVLink. Our Spectrum-X enhanced Ethernet solutions provide the highest throughput and lowest latency network for Ethernet AI workloads. Spectrum-X Ethernet delivered double-digit sequential and year-over-year growth with annualized revenue exceeding $10 billion. At Hot Chips, we introduced Spectrum-XGS Ethernet, a technology designed to unify disparate data centers into giga-scale AI super factories. [indiscernible] is an initial adopter of the solution, which is projected to double GPU-to-GPU communication speed.
InfiniBand revenue nearly doubled sequentially, fueled by the adoption of XDR technology, which provides double the bandwidth of its predecessor, especially valuable for the model builders. NVLink, the world's fastest switch, with 14x the bandwidth of PCIe Gen 5, delivered strong growth as customers deployed Blackwell NVLink rack-scale systems. The positive reception to NVLink Fusion, which allows semi-custom AI infrastructure, has been widespread. Japan's upcoming Fugaku Next will integrate Fujitsu's CPUs with our architecture via NVLink Fusion. It will run a range of workloads, including AI, supercomputing and quantum computing. Fugaku Next joins a rapidly expanding list of leading quantum supercomputing and research centers running on NVIDIA's CUDA-Q quantum platform, including ULC, AIST, NNS and supported by over 300 ecosystem partners, including AWS, Google Quantum AI, Quantinuum, QuEra and PsiQuantum.
Jetson Thor, our new robotics computing platform, is now available. Thor delivers an order of magnitude greater AI performance and energy efficiency than NVIDIA AGX Orin. It runs the latest generative and reasoning AI models at the edge in real time, enabling state-of-the-art robotics. Adoption of NVIDIA's robotics full stack platform is growing at a rapid rate. Over 2 million developers and 1,000-plus hardware, software application and sensor partners are taking our platform to market. Leading enterprises across industries have adopted Thor, including Agility Robotics, Amazon Robotics, Boston Dynamics, Caterpillar, Figure, Hexagon, Medtronic and Meta.
Robotic applications require exponentially more compute on the device and in infrastructure, representing a significant long-term demand driver for our data center platform. NVIDIA Omniverse with Cosmos is our data center physical AI digital twin platform built for development of robots and robotic systems. This quarter, we announced a major expansion of our partnership with Siemens to enable AI automatic factories. Leading European robotics companies, including Agile Robots, Neura Robotics and Universal Robots, are building their latest innovations with the Omniverse platform.
Transitioning to a quick summary of our revenue by geography. China declined on a sequential basis to a low single-digit percentage of data center revenue. Note, our Q3 outlook does not include H20 shipments to China customers. Singapore revenue represented 22% of second quarter's billed revenue as customers have centralized their invoicing in Singapore. Over 99% of data center compute revenue billed to Singapore was for U.S.-based customers. Our gaming revenue was a record $4.3 billion, a 14% sequential increase and a 49% jump year-on-year. This was driven by the ramp of Blackwell GeForce GPUs, as strong sales continued as we increased supply availability.
This quarter, we shipped the GeForce RTX 5060 desktop GPU. It brings double the performance along with advanced ray tracing, neural rendering and AI-powered DLSS 4 gameplay to millions of gamers worldwide. Blackwell is coming to GeForce NOW in September. This is GeForce NOW's most significant upgrade, offering RTX 5080-class performance, minimal latency and 5K resolution at 120 frames per second. We are also doubling the GeForce NOW catalog to over 4,500 titles, the largest library of any cloud gaming service.
For AI enthusiasts, on-device AI performs best on RTX GPUs. We partnered with OpenAI to optimize their open source GPT models for high-quality, fast and efficient inference on millions of RTX-enabled Windows devices. With the RTX platform stack, Windows developers can create AI applications designed to run on the world's largest AI PC user base.
Professional Visualization revenue reached $601 million, a 32% year-on-year increase. Growth was driven by adoption of the high-end RTX Workstation GPUs and AI-powered workloads like design, simulation and prototyping. Key customers are leveraging our solutions to accelerate their operations. Activision Blizzard uses RTX workstations to enhance creative workflows, while robotics innovator Figure AI powers its humanoid robots with RTX embedded GPUs.
Automotive revenue, which includes only in-car compute revenue, was $586 million, up 69% year-on-year, primarily driven by self-driving solutions. We have begun shipments of NVIDIA Thor SoC, the successor to Orin. Thor's arrival coincides with the industry's accelerating shift to vision language model architecture, generative AI and higher levels of autonomy. Thor is the most successful robotics and AV computer we've ever created. Our full stack Drive AV software platform is now in production, opening up billions in new revenue opportunities for NVIDIA while improving vehicle safety and autonomy.
Now moving to the rest of our P&L. GAAP gross margin was 72.4% and non-GAAP gross margin was 72.7%. These figures include a $180 million or 40 basis point benefit from releasing previously reserved H20 inventory. Excluding this benefit, non-GAAP gross margins would have been 72.3%, still exceeding our outlook. GAAP operating expenses rose 8% sequentially, and 6% on a non-GAAP basis. This increase was driven by higher compute and infrastructure costs as well as higher compensation and benefit costs. To support the ramp of Blackwell and Blackwell Ultra, inventory increased sequentially from $11 billion to $15 billion in Q2.
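As a rough cross-check of the gross margin figures in these remarks, the sketch below recomputes the basis-point benefit from the released H20 inventory. It assumes the $180 million is measured against the $46.7 billion of total revenue quoted earlier in the call; the variable names are illustrative and do not come from NVIDIA's CFO commentary.

```python
# Figures quoted on the call (millions of USD and percent).
total_revenue_musd = 46_700       # Q2 fiscal 2026 total revenue
h20_release_musd = 180            # benefit from releasing reserved H20 inventory
non_gaap_gross_margin_pct = 72.7  # reported non-GAAP gross margin

# Assumption: the benefit is expressed relative to total revenue.
benefit_bps = h20_release_musd / total_revenue_musd * 10_000

# The call quotes the benefit rounded to the nearest 10 basis points.
rounded_benefit_bps = round(benefit_bps, -1)

# Remove the rounded benefit from the reported margin (100 bps = 1 point).
margin_ex_benefit_pct = non_gaap_gross_margin_pct - rounded_benefit_bps / 100

print(f"benefit: {benefit_bps:.1f} bps (rounded: {rounded_benefit_bps:.0f} bps)")
print(f"non-GAAP gross margin excluding benefit: {margin_ex_benefit_pct:.1f}%")
```

Under that assumption the unrounded benefit comes out slightly below the quoted 40 basis points, and subtracting the rounded figure reproduces the 72.3% mentioned in the remarks.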
While we prioritize funding our growth and strategic initiatives, in Q2, we returned $10 billion to shareholders through share repurchases and cash dividends. Our Board of Directors recently approved a $60 billion share repurchase authorization to add to our remaining $14.7 billion of authorization at the end of Q2.
Okay. Let me turn to the outlook for the third quarter. Total revenue is expected to be $54 billion, plus or minus 2%. This represents over $7 billion in sequential growth. Again, we do not assume any H20 shipments to China customers in our outlook. GAAP and non-GAAP gross margins are expected to be 73.3% and 73.5%, respectively, plus or minus 50 basis points. We continue to expect to exit the year with non-GAAP gross margins in the mid-70s. GAAP and non-GAAP operating expenses are expected to be approximately $5.9 billion and $4.2 billion, respectively.
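The revenue outlook can be unpacked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted on the call ($46.7 billion for Q2, $54 billion plus or minus 2% for Q3); the variable names are illustrative.

```python
# Figures quoted on the call, in billions of USD.
q2_revenue_busd = 46.7
q3_midpoint_busd = 54.0
tolerance = 0.02  # "plus or minus 2%"

# Implied outlook range around the midpoint.
q3_low_busd = q3_midpoint_busd * (1 - tolerance)
q3_high_busd = q3_midpoint_busd * (1 + tolerance)

# Sequential growth at the midpoint, in dollars and percent.
seq_growth_busd = q3_midpoint_busd - q2_revenue_busd
seq_growth_pct = seq_growth_busd / q2_revenue_busd * 100

print(f"Q3 range: ${q3_low_busd:.2f}B to ${q3_high_busd:.2f}B")
print(f"sequential growth at midpoint: ${seq_growth_busd:.1f}B ({seq_growth_pct:.1f}%)")
```

At the midpoint this is consistent with the "over $7 billion in sequential growth" stated in the remarks.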
For the full year, we expect operating expenses to grow in the high 30s range year-over-year, up from our prior expectations of the mid-30s. We are accelerating investments in the business to address the magnitude of growth opportunities that lie ahead. GAAP and non-GAAP other income and expenses are expected to be an income of approximately $500 million, excluding gains and losses from nonmarketable and public held equity securities. GAAP and non-GAAP tax rates are expected to be 16.5%, plus or minus 1%, excluding any discrete items.
Further financial data are included in the CFO commentary and other information available on our website. In closing, let me highlight upcoming events for the financial community. We will be at the Goldman Sachs Technology Conference on September 8 in San Francisco. Our annual NDR will commence the first part of October. GTC D.C. begins on October 27, with Jensen's keynote scheduled for the 28th. We look forward to seeing you at these events. Our earnings call to discuss the results of our third quarter of fiscal 2026 is scheduled for November 19. We will now open the call for questions.
Operator, would you please poll for questions?
[Operator Instructions] Your first question comes from CJ Muse with Cantor Fitzgerald.
2. Question Answer
I guess with wafer-in to rack-out lead times of 12 months, you confirmed on the call today that Rubin is on track for the second half. And obviously, many of these investments are multiyear projects contingent upon power, cooling, et cetera. I was hoping perhaps you could take a high-level view and speak to your vision for growth into 2026. And as part of that, if you can kind of comment between network and data center, that would be very helpful.
Yes. Thanks, CJ. At the highest level, the growth drivers would be the evolution, the introduction, if you will, of reasoning agentic AI. Where chatbots used to be one-shot, you give it a prompt and it would generate the answer, now the AI does research. It thinks and makes a plan, and it might use tools. And so it's called long thinking, and the longer it thinks, oftentimes, it produces better answers. And the amount of computation necessary for one-shot versus reasoning agentic AI models could be 100x, 1,000x and potentially even more, given the amount of research and basically reading and comprehension that it goes off to do.
And so the amount of computation that has resulted from agentic AI has grown tremendously. And of course, the effectiveness has also grown tremendously. Because of agentic AI, the amount of hallucination has dropped significantly. You can now use tools and perform tasks. Enterprises have been opened up. As a result of agentic AI and vision language models, we now are seeing a breakthrough in physical AI and robotics, autonomous systems. So in the last year, AI has made tremendous progress, and agentic systems, reasoning systems are completely revolutionary.
Now we built the Blackwell NVLink 72 system, a rack scale computing system for this moment. We've been working on it for several years. This last year, we transitioned from NVLink 8, which is a node scale computing, each node is a computer to now NVLink 72, where each rack is a computer. That disaggregation of NVLink 72 into a rack scale system was extremely hard to do, but the results are extraordinary. We're seeing orders of magnitude speed up and therefore, energy efficiency and therefore, cost effectiveness of token generation because of NVLink 72.
And so over the next couple of years, you're going to -- well, you asked about longer term, over the next 5 years, we're going to scale, with Blackwell, with Rubin and follow-ons, into effectively a $3 trillion to $4 trillion AI infrastructure opportunity. The last couple of years, you have seen that CapEx in just the top 4 CSPs has doubled and grown to about $600 billion. So we're in the beginning of this build-out, and the AI technology advances have really enabled AI to be adopted and to solve problems in many different industries.
Your next question comes from Vivek Arya with Bank of America Securities.
Colette, I just wanted to clarify, $2 billion to $5 billion in China, what needs to happen? And what is the sustainable pace of that China business as you get into Q4? And then, Jensen, for you on the competitive landscape, several of your large customers already have or are planning many ASIC projects. I think one of your ASIC competitors, Broadcom, signaled that they could grow their AI business almost 55%, 60% next year. Any scenario in which you see the market moving more towards ASICs and away from NVIDIA GPUs? Just what are you hearing from your customers, how are they managing this split between the use of merchant silicon and ASICs?
Thanks, Vivek. So let me first answer your question regarding what it will take for the H20s to be shipped. There is interest in our H20s. There is the initial set of licenses that we received. And then additionally, we do have supply that is ready, and that's why we communicated that somewhere in the range of about $2 billion to $5 billion this quarter we could potentially ship.
We're still waiting on several of the geopolitical issues going back and forth between the governments and the companies trying to determine their purchases and what they want to do. So it's still open at this time, and we're not exactly sure what that full amount will be this quarter. However, if more interest arrives, more licenses arrive, again, we can also still build additional H20 and ship more as well.
NVIDIA builds very different things than ASICs. So let's talk about ASICs first. A lot of projects are started, many start-up companies are created, but very few products go into production. And the reason for that is it's really hard. Accelerated computing is unlike general-purpose computing. You don't write software and just compile it into a processor. Accelerated computing is a full stack co-design problem. And AI factories in the last several years have become so much more complex because the scale of the problems has grown so significantly. It is really the ultimate, the most extreme computer science problem the world's ever seen, obviously.
And so the stack is complicated. The models are changing incredibly fast, from generative based on autoregressive to generative based on diffusion to mixed models to multi-modality. The number of different models that are coming out that are either derivatives of transformers or evolutions of transformers is just daunting. One of the advantages that we have is that NVIDIA is available in every cloud. We're available from every computer company. We're available from the cloud to on-prem to edge to robotics on the same programming model. And so it's sensible that every framework in the world supports NVIDIA. When you're building a new model architecture, releasing it on NVIDIA is most sensible. And so there's the diversity of our platform, both in the ability to evolve into any architecture and the fact that we're everywhere. And also, we accelerate the entire pipeline.
Everything from data processing to pretraining to post training with reinforcement learning, all the way out to inference. And so when you build a data center with the NVIDIA platform in it, the utility of it is best. The lifetime usefulness is much, much longer. And then I would just say that in addition to all of that, it's just a really extremely complex systems problem now. People talk about the chip itself. There's 1 ASIC, the GPU, that many people talk about. But in order to build Blackwell the platform and Rubin the platform, we had to build CPUs that connect fast memory, extremely energy-efficient memory for the large KV caching necessary for agentic AI, to the GPU, to a SuperNIC, to a scale-up switch we call NVLink, completely revolutionary, and we're in our fifth generation now, to a scale-out switch, whether it's Quantum or Spectrum-X Ethernet, to now scale-across switches so that we could prepare for these AI super factories with multiple gigawatts of computing, all connected together. We call that Spectrum-XGS; we just announced that at Hot Chips this week.
And so the complications, the complexity of everything that we do is really quite extraordinary. It's just done at a really, really extreme scale now. And then lastly, if I could just say 1 more thing. We're in every cloud for a good reason. Not only are we the most energy efficient, our perf per watt is the best of any computing platform. And in a world of power limited data centers, perf per watt drives directly to revenues. And you've heard me say before that in a lot of ways, the more you buy, the more you grow. And because our perf per dollar, the performance per dollar, is so incredible, you also have extremely great margins. So the growth opportunity with NVIDIA's architecture and the gross margins opportunity with NVIDIA's architecture are absolutely the best. And so there are a lot of reasons why NVIDIA is chosen by every cloud and every startup and every computer company. We're really a holistic full stack solution for AI factories.
Your next question comes from Ben Reitzes with Melius.
Jensen, I wanted to ask you about your $3 trillion to $4 trillion in data center infrastructure spend by the end of the decade. Previously, you talked about something in the $1 trillion range, which I believe was just for compute by 2028. If you take past comments, $3 trillion to $4 trillion would imply maybe $2 trillion plus in compute spend. And I just wanted to know if that was right, and that's what you're seeing by the end of the decade. And I'm wondering what you think your share will be of that. Your share right now of total infrastructure, compute-wise, is very high. So I wanted to see. And also if there are any bottlenecks you're concerned about, like power, to get to the $3 trillion to $4 trillion.
As you know, the CapEx of just the top 4 hyperscalers has doubled in 2 years. As the AI revolution went into full steam, as the AI race is now on, the CapEx spend has doubled to $600 billion per year. There's 5 years between now and the end of the decade and $600 billion only represents the top 4 hyperscalers. We still have the rest of the enterprise companies building on-prem. You have cloud service providers building around the world. United States represents about 60% of the world's compute and over time, you would think that artificial intelligence would reflect GDP scale and growth. And so -- and we'll be, of course, accelerating GDP growth.
And so our contribution to that is a large part of the AI infrastructure. Out of a gigawatt AI factory, which can go anywhere from $50 billion, plus or minus 10%, let's say, $50 billion to $60 billion, we represent about 35% plus or minus of that, and $35 billion out of $50 billion per gigawatt data center. And of course, what you get for that is not a GPU. I think we're famous for building the GPU and inventing the GPU. But as you know, over the last decade, we've really transitioned to become an AI infrastructure company. It takes 6 chips, 6 different types of chips, just to build a Rubin AI supercomputer. And just to scale that out to a gigawatt, you have hundreds of thousands of GPU compute nodes and a whole bunch of racks. And so we're really an AI infrastructure company. And we're hoping to continue to contribute to growing this industry, making AI more useful and then, very importantly, driving the performance per watt, because, as you mentioned limiters, the world will likely always have power limitations or building limitations.
And so we need to squeeze as much out of that factory as possible. NVIDIA's performance per unit of energy used drives the revenue growth of that factory. It directly translates. If you have a 100-megawatt factory, perf per 100-megawatt drives your revenues. It's tokens per 100 megawatts of factory. In our case, also, the performance per dollar spent is so high that your gross margins are also the best. But anyhow, these are the limiters going forward and $3 trillion to $4 trillion is fairly sensible for the next 5 years.
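The power-limited argument above can be sketched as simple arithmetic. The snippet below is a minimal illustration, not NVIDIA's model: the function name, the token throughput, and the token price are all hypothetical placeholders chosen only to show that, with power fixed, revenue scales linearly with tokens per megawatt.

```python
# Illustrative sketch only. Every numeric input here is a made-up
# placeholder, not a figure from the call.
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_mw, tokens_per_sec_per_mw,
                         usd_per_million_tokens, utilization=1.0):
    """Yearly revenue of a factory whose size is capped by its power budget.

    Power is the fixed input, so the only way to raise revenue at a given
    token price is to raise tokens per second per megawatt.
    """
    tokens_per_year = (power_mw * tokens_per_sec_per_mw
                       * utilization * SECONDS_PER_YEAR)
    return tokens_per_year / 1_000_000 * usd_per_million_tokens


# Same 100-megawatt factory and same token price; only perf per watt changes.
baseline = annual_token_revenue(100, 1_000_000, 2.0)
improved = annual_token_revenue(100, 2_000_000, 2.0)
ratio = improved / baseline  # doubling perf per watt doubles revenue
```

Under these assumptions the revenue ratio equals the perf-per-watt ratio, which is the sense in which tokens per 100 megawatts "directly translates" into revenue.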
Next question comes from Joe Moore of Morgan Stanley.
Great, congratulations on reopening the China opportunity. Can you talk about the long-term prospects there? You've talked about, I think, half of the AI software world being there. How much can NVIDIA grow in that business? And how important is it that you get the Blackwell architecture ultimately licensed there?
The China market, I've estimated to be about $50 billion of opportunity for us this year, if we were able to address it with competitive products. And if it's $50 billion this year, you would expect it to grow, say, 50% per year, as the rest of the world's AI market is growing as well. It is the second largest computing market in the world, and it is also the home of AI researchers. About 50% of the world's AI researchers are in China. The vast majority of the leading open source models are created in China. And so it's fairly important, I think, for the American technology companies to be able to address that market. And open source, as you know, is created in one country, but it's used all over the world.
The open source models that have come out of China are really excellent. DeepSeek, of course, gained global notoriety. Qwen is excellent. Kimi's excellent. There's a whole bunch of new models that are coming out. They're multimodal, they're great language models. And it's really fueled the adoption of AI in enterprises around the world because enterprises want to build their own custom proprietary software stacks. And so open source models are really important for enterprise. They're really important for SaaS companies, who also would like to build proprietary systems.
It has been really incredible for robotics around the world. And so open source is really important. And it's important that the American companies are able to address it. This is -- it's going to be a very large market. We're talking to the administration about the importance of American companies being able to address the Chinese market. And as you know, H20 has been approved for companies that are not on the entity list, and many licenses have been approved. And so I think the opportunity for us to bring Blackwell to the China market is a real possibility. And so we just have to keep advocating for the sensibility and the importance of American tech companies being able to lead and win the AI race and help make the American tech stack the global standard.
Your next question comes from the line of Aaron Rakers with Wells Fargo.
Yes. Thank you for the question. I want to go back to the Spectrum-X GS announcement this week. And thinking about the Ethernet product now pushing over $10 billion of annualized revenue. Jensen, what is the opportunity set that you see for Spectrum-X GS? Do we think about this as kind of the data center interconnect layer? Any thoughts on the sizing of this opportunity within that Ethernet portfolio?
We now offer 3 networking technologies. One is for scale-up, one is for scale-out and one is for scale-across. Scale-up is so that we could build the largest possible virtual GPU, the virtual compute node. NVLink is revolutionary. NVLink 72 is what made it possible for Blackwell to deliver such an extraordinary generational jump over Hopper's NVLink 8. At a time when we have long thinking models, agentic AI and reasoning systems, NVLink basically amplifies the memory bandwidth, which is really critical for reasoning systems. And so NVLink 72 is fantastic.
We then scale out with networking, for which we have 2 options. We have InfiniBand, which is unquestionably the lowest latency, the lowest jitter, the best scale-out network. It does require more expertise in managing those networks. For supercomputing, for the leading model makers, Quantum InfiniBand is the unambiguous choice. If you were to benchmark AI factories, the ones with InfiniBand have the best performance. For those who would like to use Ethernet because their whole data center is built with Ethernet, we have a new type of Ethernet called Spectrum Ethernet. Spectrum Ethernet is not off the shelf. It has a whole bunch of new technologies designed for low latency, low jitter and congestion control. And it has the ability to come closer, much, much closer to InfiniBand than anything that's out there. We call that Spectrum-X Ethernet.
And then finally, we have Spectrum-X GS, a giga-scale for connecting multiple data centers, multiple AI factories into a super factory, a gigantic system. And we're going to -- you're going to see that networking obviously is very important in AI factories. In fact, choosing the right networking, the performance, the throughput improvement, going from 65% to 85% or 90%, that kind of step-up because of your networking capability effectively makes networking free. Choosing the right networking, you're basically paying -- you'll get a return on it like you can't believe because the AI factory, a gigawatt, as I mentioned before, could be $50 billion. And so the ability to improve the efficiency of that factory by tens of percent is -- results in $10 billion, $20 billion worth of effective benefit. And so this -- the networking is a very important part of it.
It's the reason why NVIDIA dedicates so much to networking. That's the reason why we purchased Mellanox 5.5 years ago. And Spectrum-X, as we mentioned earlier, is now quite a sizable business, and it's only about 1.5 years old. So Spectrum-X is a home run. All 3 of them are going to be fantastic: NVLink for scale-up, Spectrum-X and InfiniBand for scale-out, and then Spectrum-X GS for scale-across.
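The "networking is effectively free" arithmetic in this answer can be sketched as follows. This is one hedged reading, not a formula given on the call: the two function names are hypothetical, and the only inputs taken from the transcript are the roughly $50 billion gigawatt factory and the step from 65% to 85% or 90% throughput efficiency.

```python
def added_useful_capex(capex_usd_bn, old_eff, new_eff):
    """Absolute reading: the share of capex that starts doing useful work
    once efficiency rises from old_eff to new_eff."""
    return capex_usd_bn * (new_eff - old_eff)


def equivalent_extra_capex(capex_usd_bn, old_eff, new_eff):
    """Relative reading: the extra capex that would be needed at the old
    efficiency to match the output achieved at the new efficiency."""
    return capex_usd_bn * (new_eff / old_eff - 1.0)


FACTORY_CAPEX_BN = 50.0  # gigawatt AI factory cost cited in the answer

absolute_low = added_useful_capex(FACTORY_CAPEX_BN, 0.65, 0.85)
absolute_high = added_useful_capex(FACTORY_CAPEX_BN, 0.65, 0.90)
relative_low = equivalent_extra_capex(FACTORY_CAPEX_BN, 0.65, 0.85)
relative_high = equivalent_extra_capex(FACTORY_CAPEX_BN, 0.65, 0.90)
```

Under these assumptions the absolute reading gives about $10 billion to $12.5 billion and the relative reading about $15 billion to $19 billion, so both land inside the "$10 billion, $20 billion" range mentioned in the answer.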
Your next question comes from Stacy Rasgon with Bernstein Research.
I have a more tactical question for Colette. So on the guidance, we're up over $7 billion, and the vast bulk of that is going to be from data center. How do I think about apportioning that $7 billion across Blackwell versus Hopper versus networking? I mean it looks like Blackwell was probably $27 billion in the quarter, up from maybe $23 billion last quarter. Hopper is still $6 billion or $7 billion post the H20. Like do you think the Hopper strength continues? Just how do I think about parsing that $7 billion out across the different components?
Thanks, Stacy, for the question. First part of it, looking at our growth between Q2 and Q3, Blackwell is still going to be the lion's share of what we have in terms of data center. But keep in mind, that helps both our compute side as well as our networking side, because we are selling those significant systems that are incorporating the NVLink that Jensen just spoke about. Hopper, we are still selling it. H100s, H200s, again, they are HGX systems, and we still believe Blackwell will be the lion's share of what we're doing there. So we'll continue. We don't have any more specific details in terms of how we'll finish our quarter, but you should expect Blackwell again to be the driver of the growth.
Your next question comes from Jim Schneider of Goldman Sachs.
You've been very clear about the reasoning model opportunity that you see, and you've also been relatively clear about technical specs for Rubin, but maybe you could provide a little bit of context about how you view the Rubin product transition going forward. What incremental capability does that offer to customers? And would you say that Rubin is a bigger, smaller or similar step up in terms of performance from a capability perspective relative to what we saw with Blackwell.
Rubin. Rubin, we're on an annual cycle. And the reason why we're on an annual cycle is because we can do so to accelerate the cost reduction and maximize the revenue generation for our customers. When we increase the perf per watt, the token generation per amount of usage of energy, we are effectively driving the revenues of our customers. The perf per watt of Blackwell will be, for reasoning systems, an order of magnitude higher than Hopper. And so for the same amount of energy, and everybody's data center is energy limited by definition, for any data center using Blackwell, you'll be able to maximize your revenues compared to anything we've done in the past, compared to anything in the world today. And because the performance is so good, the perf per dollar invested in the capital would also allow you to improve your gross margins.
To the extent that we have great ideas for every single generation, we could improve the revenue generation, improve the AI capability, improve the margins of our customers by releasing new architectures. And so we advise our partners, our customers to pace themselves and to build these data centers on an annual rhythm. And Rubin is going to have a whole bunch of new ideas. I will pause for a second because I've got plenty of time between now and a year from now to tell you about all the breakthroughs that Rubin is going to bring. But Rubin has a lot of great ideas.
I'm anxious to tell you but I can't right now. And I'll save it for GTC to tell you more and more about it. But nonetheless, for the next year, we're ramping really hard into now Grace Blackwell, GB200, and then now Blackwell Ultra, GB300. We're ramping really hard into data centers. This year is obviously a record-breaking year. I expect next year to be a record-breaking year, while we continue to increase the performance of AI capabilities as we race towards artificial super intelligence on the one hand and continue to increase the revenue generation capabilities of our hyperscalers on the other hand.
Your final question comes from Timothy Arcuri with UBS.
Jensen, I wanted to ask, you just answered the question. You threw out a number, you said 50% CAGR for the AI market. So I'm wondering how much visibility you have into next year. Is that kind of a reasonable bogey in terms of how much your data center revenue should grow next year? I would think you'll grow at least in line with that CAGR? And maybe are there any puts and takes to that?
Well, I think the best way to look at it is we have reasonable forecasts from our large customers for next year, a very, very significant forecast. And we still have a lot of business that we're still winning and a lot of start-ups that are still being created. Don't forget that AI start-ups received $100 billion in funding last year. This year, and the year is not even over yet, it's $180 billion funded. If you look at the top AI native start-ups that are generating revenues, last year was $2 billion. This year, it's $20 billion. Next year being 10x higher than this year is not inconceivable. And open source models are now opening up large enterprises, SaaS companies, industrial companies and robotics companies to join the AI revolution, another source of growth. And whether it's AI natives or enterprise SaaS or industrial AI or start-ups, we're just seeing an enormous amount of interest in AI and demand for AI.
Right now, the buzz is, I'm sure all of you know about the buzz out there. The buzz is everything sold out. H100 sold out. H200s are sold out. Large CSPs are coming out renting capacity from other CSPs. And so the AI native start-ups are really scrambling to get capacity so that they could train their reasoning models. And so the demand is really, really high.
But the long-term outlook between where we are today: CapEx has doubled in 2 years. It is now running about $600 billion a year just in the large hyperscalers. For us to grow into that $600 billion a year, representing a significant part of that CapEx, isn't unreasonable. And so I think for the next several years, surely through the decade, we see really fast-growing, really significant growth opportunities ahead.
Let me conclude with this. Blackwell is the next-generation AI platform the world has been waiting for. It delivers an exceptional generational leap. NVIDIA's NVLink 72 rack-scale computing is revolutionary, arriving just in time as reasoning AI models drive order-of-magnitude increases in training and inference performance requirements. Blackwell Ultra is ramping at full speed and the demand is extraordinary. Our next platform, Rubin, is already in fab. We have 6 new chips that represent the Rubin platform. They have all taped out at TSMC. Rubin will be our third-generation NVLink rack-scale AI supercomputer. And so we expect to have a much more mature and fully scaled-up supply chain. Blackwell and Rubin AI factory platforms will be scaling into the $3 trillion to $4 trillion global AI factory build-out through the end of the decade.
Customers are building ever greater scale AI factories, from thousands of Hopper GPUs in tens-of-megawatt data centers to now hundreds of thousands of Blackwells in 100-megawatt facilities. And soon, we'll be building millions of Rubin GPU platforms, powering multi-gigawatt, multisite AI super factories. With each generation, demand only grows. One-shot chatbots have evolved into reasoning agentic AI that researches, plans and uses tools, driving orders-of-magnitude jumps in compute for both training and inference. Agentic AI is reaching maturity and has opened the enterprise market to build domain and company-specific AI agents for enterprise workflows, products and services.
The age of physical AI has arrived, unlocking entirely new industries in robotics and industrial automation. Every industrial company will need to build 2 factories: one to build the machines and another to build their robotic AI. This quarter, NVIDIA reached record revenues and an extraordinary milestone in our journey. The opportunity ahead is immense. A new industrial revolution has started. The AI race is on. Thanks for joining us today, and I look forward to addressing you next week -- next earnings call. Thank you.
This concludes today's conference call. You may now disconnect.
NVIDIA — Q2 2026 Earnings Call
📊 Quarter at a glance
- Revenue: $46.7 billion (record quarter; overall growth driven by Data Center)
- Data Center: +56% YoY; GB300 production started; H20-related revenue $4 billion lower
- Networking: $7.3 billion (record; +46% sequentially, +98% YoY)
- Margins: GAAP (US Generally Accepted Accounting Principles) 72.4%, non-GAAP (adjusted) 72.7%, including a $180 million inventory release
- Cash & buybacks: $10 billion returned to shareholders; remaining buyback authorization $14.7 billion; inventory $15 billion
🎯 What management says
- GB300 ramp: production and shipments under way, ~1,000 racks/week; seamless migration from GB200; substantial inference efficiency gains (tokens per watt ≈10x vs. Hopper)
- Rubin platform: Rubin chips in fabrication; volume production confirmed for next year; annual product cadence remains
- Platform argument: management stresses the full-stack advantage (GPU (graphics processing unit), CPU, SuperNIC, NVLink, Spectrum-X) and the developer ecosystem as differentiation from ASIC projects
🔭 Outlook & guidance
- Q3 revenue: expected $54 billion ±2% (≈ +$7 billion sequentially)
- Margins & costs: GAAP/non-GAAP gross margins ~73.3%/73.5% ±50 basis points; GAAP OpEx ≈ $5.9 billion, non-GAAP OpEx ≈ $4.2 billion
- H20 & China: guidance excludes H20 shipments to China; if licenses are granted, potentially $2 billion to $5 billion of additional revenue in Q3
- Full-year view: operating expenses now expected to grow in the high-30% range YoY; tax rate ~16.5% ±1%
❓ Analyst questions
- Growth drivers: analysts focused on reasoning/agentic AI as the cause of exponential compute demand; management sees 100x to 1,000x more compute needed for such models
- China risk: demand and revenue in China depend on licenses; management cites the China opportunity (~$50 billion market potential) and remains in talks with US authorities
- Competition: questions about ASICs were answered by emphasizing platform flexibility, the ecosystem and perf per watt as protective factors
⚡ Bottom line
- In short: a strong, high-margin quarter with clear product and market momentum (Blackwell/GB300); Q3 guidance set high. Key risks: geopolitical uncertainty around H20 sales to China and higher inventory. Shareholder-friendly capital return remains robust.
NVIDIA — Shareholder/Analyst Call - NVIDIA Corporation
1. Management Discussion
So thanks for joining us. Jensen and I are here to go through any questions that you have, both on what you've seen today in our Paris GTC, but also we have done several other GTCs over the last couple of months, adding up to that. Probably the most important thing to understand, moving here and doing so much of this here in Europe, in the EU, in France, as well as in the time that we spent in Taiwan, is to really emphasize that AI is here worldwide. And the importance of seeing what is going to be possible for AI: this is an area that is growing faster than any other technology in history and that has reached every single region at the speed that it has.
What that's going to take, though, is the influence and help both of the sovereign nations, the sovereign help that we're going to need from a government to really expand both here in the EU area, in France, to do so. I know you saw so much of that today with Jensen. But we'd also like to talk more just from the investor standpoint in terms of what we're seeing.
So we're going to open it up for questions unless you want to start with some...
No. Great to see all of you. Make it nice and loud.
2. Question Answer
Cantor Fitzgerald. Two-part question. First, your commentary on quantum computing seemed to change a little bit. So curious where do you see commercialization first? And then secondly, on the sovereign front, you've been traveling throughout Europe. I think your travels continue beyond France. And would love to hear how your conversations have gone and how you think about the magnitude of coming investments relative to kind of what we heard from the Middle East.
Yes, I appreciate it. First of all, my feelings about quantum is consistent with the past. However, my feelings about quantum classical is very different. And I think that the entire industry is now recognizing that quantum classical is the way to go. It's not about a stand-alone quantum computer, it's about a quantum computer connected to a GPU supercomputer to do all of the controls, to do the error correction and the groundbreaking work that's being done in error correction is really quite significant.
Basically, if you look at a qubit today, a logical qubit is represented by a cluster of physical qubits. Are you guys -- anybody -- am I talking weird stuff? So these physical qubits, it takes a whole bunch of them working together, entangled together, to represent a logical qubit. And then you have a bunch of ancilla qubits, which are basically shadow qubits because, as you know, the Schrödinger's Cat problem, if you observe -- if you try to measure the quantum, the qubits, it collapses the state. It loses coherence. So it's either -- it's no longer in that superpositioned state. It will either be on or off. It will be the Schrödinger's Cat, it will be dead or alive, but it's no longer superpositioned.
The recent breakthroughs in using error correction require a lot of computing outside the quantum computer. And we're making really great breakthroughs there. And so if you look at the GPU supercomputers that are going to be connected to these quantum computers, they're going to be giant, just doing the error correction stuff. If we keep going at this rate, let's say that we get 10x as many logical qubits every 5 years. So we'll probably have something close to 20 to 100 logical qubits in some 5 years. 100 logical qubits, just the number of -- the amount of state that it could represent is sufficient to do some early biomolecular or chemistry stuff, material work that could be quite useful.
And the way that we're going to do it, and this is the reason why I think the community is getting together on this idea, is that instead of using the quantum computer to do all the simulations, what we're going to do is use the quantum computer to generate ground truth, the electron simulations, where it behaves like a quantum system, like an electron state, and then generate a whole bunch of synthetic data that we'll train AI models with. Are you guys following me? So this quantum classical hybrid is gaining a lot of momentum right now. And I think everybody is getting excited. So we can kind of see it being kind of 2 or 3 years out, doing some real work. But in the meantime, what I said is true. Every single supercomputing center is going to go quantum classical, every -- 100%. I've not met one that's not going to go quantum classical.
And that's why CUDA-Q is such a revolution. We're basically working with everybody in the quantum computing industry on CUDA-Q. And with respect to the build-out, here it is much more for local use, indigenous use. Middle East was some indigenous use, but it's really about hosting the cloud for American companies. And so it's related, not exactly the same. Does it make sense? Most of the stuff that we're talking about here, the telcos, the regional cloud service providers, the 20 AI factories that are going to get built, supported by government across pan-European countries, that's all being built for local consumption.
And I think that long term, it's just going to represent -- sovereign AI should represent the GDP of the countries. In the case of Europe, it's been -- it's taken longer to get engaged. And the reason for that is because their information technology industry is lighter than the United States, but their heavy industry is much bigger than the United States. That's the reason why robotics is going to be such a big deal here. Industrial digital twins is going to be a big deal here. All the factories are going to be digital. AI is going to be everywhere in those factories. And that's the reason why the topic here is quite different than the topics in the United States.
Overall, the world's regions combined, we estimate over the course of the next several years, about $1.5 trillion worth of build-out. And so you're kind of -- once you cobble up all the math, it kind of makes sense.
Joe Moore, Morgan Stanley. You talked a lot about physical AI today. Can you talk about what we should look for to sort of see the model development? And is it going to be the start-ups developing physical AI capabilities, new models? Or are you actually seeing those physical AI -- is that information getting incorporated into the LLMs that are already foundational?
The physical AI models are going to be different than the LLMs. They're going to be multimodal. Like, for example, you'll walk up to a robot and just tell it to do something and just as generative AI can generate pixels just by your prompts, you should be able to generate motion from the prompts. And you'll reason about it. Just like generative AI right now can reason about the prompts and reason about the pixels before it generates the pixels, you can now reason about the motion before it generates the motion. And you could see the robot thinking, okay, I've been asked to put this apple in that drawer, but the drawer is not open. So first, I have to open a drawer. And then I got to pick up the apple, put the apple in the drawer, close the drawer.
You guys -- and so that reasoning process, you can kind of see it happening now, right? Because you guys see it in GPT o3 or DeepSeek or you could see the technology exists to do all that. My sense is that the countries here, Germany has a lot of robotics capability. France has robotics capability. U.K. has -- I mean, there's -- because the heavy industry is quite -- the Nordic countries have a lot of robotics, ABB, for example, lots of robotics capability here. And they've just been missing the software capability, if you will. And previous generation robots are all prearticulated. Do you guys understand what I'm saying, pick this thing up from here, put it over here, 100% every single time. So it's preprogrammed.
Now you don't have to preprogram it, you just tell it to do it. And as a result, robots will be -- robotics will be much more accessible to the smaller and medium-sized companies.
It's Brett Simpson at Arete. Jensen, I just wanted to ask about gigawatt, these gigawatt projects that are being announced. It's a fairly new concept to us all here. But how many do you have line of sight on looking into the next 2, 3 years? How many projects -- how many gigawatt projects do you think are already underway? I guess there's one coming in France. It's been announced already. But give us a sense. And in your presentation earlier, I think you said there were 5 European gigafactories. Is that 5 separate gigawatt projects? But if you can help just give us a sense for the scale of these -- how many gigawatt projects you have?
Yes, you got it all right. We have line of sight towards the telcos, the regional cloud providers that I mentioned. For example, Mistral is the one here in France. In U.K., it's Nscale, Nebius. In Germany, they changed their name from iGenius. I think it's called [ Combin ] or something like that. But I thought iGenius was pretty good. I'll go find out why they changed. But these are all line of sight. And then the 20 that are -- none of these are supported by government. These are all business-oriented start-ups or scale-outs.
The ones that are supported by government are the 20 AI factories and a handful of them are gigafactories. That's what we have line of sight on at this moment. There'll probably be more. If you just kind of added up all of the -- everything that I just said, it is lower than the GDP representation of Europe. Now of course, for some time, the American cloud service providers will come and serve that, okay? So over time, maybe the regional cloud providers will get larger and larger. And because you have sovereignty issues with respect to data privacy and generally, people being concerned about geopolitics these days. So you kind of want to have infrastructure locally in each one of these countries. There's a reason for some of the build-out.
Louis Miscioscia, Daiwa Capital Markets America.
Maybe if you go into what the limiting factors are both for you to produce more of your products and then from a small scale. And then from a big scale, you mentioned like that maybe some European companies don't have the software ability. What also could just drive additional demand for all this AI stuff that we see on these -- at your conference here, which is pretty amazing.
The supply -- none of the supply is horribly difficult to get now. It's constrained, but we're still growing fairly fast. So nothing is sitting around. We don't have a whole bunch of Blackwells and CoWoS-Ls and these supercomputers sitting around. They build what we ask them to build. And so we have to forecast it, but we're not limited by CoWoS. We're not limited by [ HBM. ] I just have to forecast it. And our lead times are probably more than a year. From the time that I start wafers on Blackwell to the time I ship a supercomputer out the door, it's coming up close to a year, which is a real advantage for us. And the reason for that is because I have a better feel of the total consumption in the world than just about anybody.
And so I could place a giant order on TSMC and Micron and Hynix and Samsung and SPIL and Amkor and Foxconn, I mean our supply chain is massive. And we could place a few hundred billion dollar order on our supply chain because we have great confidence of the end market and the fungibility of our products everywhere. If we were somehow bespoke, if you will, it's only useful for this customer, then it's harder for us to have the confidence to build for the whole market. Our confidence -- NVIDIA is everywhere. And so we're not so much limited by any critical component per se. It's just everything we build is not easy per se. And so we just have to forecast it.
In terms of the end market, there are several things that limit the end market. One of them is just local languages. We think that everybody should speak English, but they don't. And some people prefer interacting with their devices in their native language and which is very understandable. There's -- of course, if you want to reach the whole population in Israel and Israel comes to mind because I was just working on it, you're going to need a large language model trained in the language and the data and the customs of Hebrew.
And so the same with Arabic countries and so on and so forth. You just multiply that out. If we want AI to be successful in each one of these regions, the technology, the agentic technology is there, but the reasoning AI language model needs to exist. And so I was talking about that today. All of those partners of ours that are going to take NVIDIA's Nemotron and optimize it for their local language, they now have a state-of-the-art capability. They already have the data prepared for their local languages, and we'll fine-tune each and every one of them. Each one of them will probably take about a month of supercomputer work, but it's not so bad. And then we take that model. Now we have to connect it into a search system, and Perplexity is ideal. We just plug it right into Perplexity and off they go. That's the idea.
It's Timm Schulze-Melander from Redburn.
I just had a follow-on actually from Joe Moore's question about sort of model progress. So anecdotally, there's lots of excitement, lots of enthusiasm, but also investors that I speak to who may be on the more skeptical side point to the fact that MMLU scores are topping out and maybe there's some impatience for more tangible real-world applications. So as you work across all of the vectors at NVIDIA, could I just ask, maybe for multimodal large language sort of reasoning models, what are your sort of preferred measures of AI capability? How do you keep...
Really good question. As you know, the reason why reasoning is such a breakthrough is because a reasoning model can solve a problem that it has never seen before. First of all, it makes sense, right? Because you're breaking it down step by step, and each one of those steps, you know how to do. And one of those steps might be read this document, learn it, come back and do the next step, okay? And so the reason why agents are so much more effective than a pretrained reasoning model is because the agent can benefit from context. Go read this document and the document tells you exactly how to do it, come back and do it. MMLU doesn't do that.
An open language model sitting out in free space doesn't have the benefit of your fine-tuning, your training. That's the reason why enterprise models are going to be so good. And we're experiencing this all over the place. The work we do with ServiceNow and SAP and Cadence, they're all super agents, but they're narrow super agents. And we give them context and retrieval augmented generation, we fine-tune them. We teach them with human demonstration. Does that make sense? Our goal is to design a chip. I don't need you to be a history expert.
My goal is to do supply chain management. If you don't know anything about taxes, I'm going to survive. Do you see what I'm saying? And so we take these reasoning models, these agents, and we fine-tune them into the job we need them to do. That's the reason why. Don't worry about -- these AI models are going to get better and better and better, no doubt. Just look at the curve, it's going to get better, but who cares? My job is not to wait for artificial super intelligence. I just want to do a good job with my supply chain management.
Antoine, New Street Research.
So thank you very much for sharing the growth forecast earlier in Europe in terms of capacity. It seems that Europe alone could be a very strong driver of growth in 2026. And so that actually got me wondering more generally as we get closer to the middle of 2025, how should we be thinking about growth in data center for NVIDIA next year because now we have some hyperscalers who have guided already 2026 for CapEx. Broadcom last week, I think, said that they expect sustained revenue growth into 2026. And so that means that they're getting some visibility, right? So I assume you should also be getting some. And any comments you can make would be very helpful.
Everything that I told you guys today is in addition to CSPs, the American CSPs. And most of Europe is underserved today. And even the part that are served, the newest generation don't come out. There are so many developers and researchers that are still using Amperes. They don't even have Hoppers barely. And so that's the opportunity for the local CSPs. They could deploy the best as soon as possible and don't let it diffuse out from -- you don't have to wait for the public clouds. So this -- all of this is incremental.
Luciano from Impax.
So I think it's quite clear that you guys are clearly dominating training, pretraining. You made the case today why inference growth is really good for you in reasoning and so on. On post-training for the big models, just wanted to know what do you think the future for your business model is there in terms of not just providing the compute, but perhaps the simulation for these models a little bit like you do in robotics, if you see something similar outside robotics as well?
Yes. Post-training is an excellent opportunity for us because post-training is just a new phase of pretraining. And the new phase of pretraining, post-training does this. So the first thing you do is you give it human demonstration. We call it reinforcement learning from human feedback. So I give you demonstration and you try it until -- and I tell you whether you did a good job or not. That's like coaching. And then the second thing is like self-practice, reinforcement learning with verifiable results. So I give you a bunch of tests, and I say, these are the answers. I give you the test, the problem and the answers. And your job is to reproduce the results. And you just keep trying until you get it. You know what the right answer is. If you get closer, I'll give you a positive feedback. If you get further away, I give you negative feedback.
And so that could be used for coding. The results are very verifiable. It could be used for science simulation. There's a whole bunch of tools we've already created as humans that are excellent at providing the feedback, the ground truth. That's called reinforcement learning with verifiable results. All of that requires a ton of training. You just crank forever. You get -- the amount of training you can do is as much time as you have. How much human practice can you possibly do? You can practice as much as you like.
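[Editorial illustration, not part of the call.] The self-practice loop described above (try an answer, check it against a known result, receive positive or negative feedback) can be sketched as a toy policy-gradient loop. This is a minimal sketch under stated assumptions: the `verifier` function, the single arithmetic problem, the ten candidate answers, the learning rate and the running baseline are all hypothetical choices for illustration and do not describe any production training setup.

```python
import math
import random


def verifier(answer: int) -> float:
    """Hypothetical verifiable check: reward 1.0 only if the answer to 3 + 4 is correct."""
    return 1.0 if answer == 3 + 4 else 0.0


def softmax(logits):
    """Turn raw scores into a probability distribution over candidate answers."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=2000, lr=0.5, seed=0):
    """Toy REINFORCE loop: sample an answer, verify it, nudge the policy."""
    rng = random.Random(seed)
    logits = [0.0] * 10          # policy over candidate answers 0..9 (assumed)
    baseline = 0.0               # running average reward, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        answer = rng.choices(range(10), weights=probs)[0]
        reward = verifier(answer)            # feedback from the ground truth
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for a in range(10):
            # Gradient of log-probability of the sampled answer w.r.t. each logit.
            grad = (1.0 if a == answer else 0.0) - probs[a]
            logits[a] += lr * advantage * grad
    return softmax(logits)


if __name__ == "__main__":
    final_probs = train()
    print("probability assigned to the verified answer:", round(final_probs[7], 3))
```

The only training signal is the verifier's yes/no result, so the loop can run for as many steps as compute allows, which is the sense in which this kind of post-training is compute-hungry.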
And so the -- that's post-training, very big deal. Yes, it uses a lot of compute. Basically, as much time as you have compute, you just got to decide when to pull the plug. I got to go to market. I can't wait anymore. That's -- tomorrow is the test. I got to -- you're out of time. And then for inference, as you know, right now, NVIDIA is the world's largest inference platform, right? Everybody says inference is easy, but there's nothing easy about inference. It's the hardest thing of all, and we're very successful in inference.
This is Rolando Grandi from Itavera.
My question will be about edge computing. So before, the case was that once the model was pretrained, the computing moved to the edge, on device, and that's it, right? So you mentioned right now that post-learning, reinforcement learning is bringing the processing back to the data center. But what about the use cases in robotics, in satellites? You had some announcements there. Elon is speaking about sending robots to Mars. Then the latency becomes a big issue, right? You need to have that on-device capability. So how does that computing on device or on the edge work in this new reinforcement learning world?
The compute is on device. We have edge -- 4 major edge use cases. One is self-driving cars. Our autonomous vehicle business is already $5 billion a year. It's a big business, training, simulation and in the car, edge AI. The second one is robotics, and that's just starting to grow and likely to be quite large. The third is facilities. This is an edge computer that sits in a factory, in a warehouse. Those are edge devices. And there, we partner with Siemens a lot.
And then another one that you've heard me talk about is base stations. Next-generation base stations are going to be based on -- the 6G base stations are going to be based on AI. And so we have a system that's called aerial. So these are our 4 primary focus areas because the software is very, very hard. The easy stuff, we're not going to go touch. I mean -- but these 4 areas are quite hard. The computer is right in the edge. We call it Orin and Thor, yes, really amazing processors.
Piggybacking on this question I also had in mind. I mean, the key device is this one, right? If we really start using -- putting more and more AI on the iPhones or the next iPhones to arrive, how would that affect your business model with your concentration on GPUs?
The more AI they use on the device, the more AI you're going to use in the data center because you still have to train the model, you have to develop the model and verify the model, evaluate the model. And all of that's done in the data center. Our business is not on the phone. The phone is not our business. And it's plenty -- there's a lot of innovation there. We're not -- we don't build modems. We don't build low-power SoCs. That's not our core business. And it's well served anyways.
And then on the supply side, your key risk is...
The more AI, the better bottom line. The absence of AI is the only thing I worry about. Please AI.
Not much to be concerned about there, I guess. The key supply risk, which you haven't mentioned, naturally is TSMC in Taiwan, right? I imagine you have devised emergency plans for that. Is -- can you speak openly about them?
We announced that we are going to build $0.5 trillion worth of AI supercomputers in the United States in the next several years, 100% from chips to packaging to integration into supercomputers. We have partners that are setting up in the United States, TSMC, SPIL, Amkor, Quanta, Wistron, Foxconn. And so they're all setting up in the United States. We're the largest customer for TSMC. They're very supportive of us. And so that's the goal.
Then we'll have -- we'll manufacture in multiple continents. We'll continue to do so in Taiwan. We manufacture in Samsung, in Korea, some of our components, and then we'll manufacture a lot more in the United States. But our supply chain is so large, we're manufacturing really almost everywhere.
How do you see the timeline for meaningfully reducing your dependency on Taiwan?
This is it. We're probably going as fast as anybody in the world is going. I think the -- the real truth is Taiwan is pretty important to the world supply chain. Let's avoid conflict, job #1.
Of course. And then the last one is how would you rate Huawei's foray into AI chip manufacturing?
Very good. Very good. They're several years behind us, but for China, it's fine. The reason for that is this, because their power is so cheap, not because China is willing to accept less, their power is cheap. And when the power is cheap, you just use more chips. This is not like iPhone, not like a phone. In our case, the AI chips, our performance efficiency is probably 4x theirs, 5x theirs, but just use 5x more chips.
In the United States, it would never fly because the data center is 100 megawatts. If they have to use 4x as many chips, it's not going to fit. They don't have enough power. And so that data center will be 1/4 the revenues, and that would never fly. But just build more data centers, use more power. And so that's why our advice is that the export control be lifted so that we can go and compete for that business. But right now, as we speak, I just want everybody to know that we have taken China out of our forecast. We're assuming 0 because at the moment, we've been banned.
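[Editorial illustration, not part of the call.] The power-versus-efficiency argument above is simple arithmetic and can be sketched as follows. The 100 megawatt site and the 4x efficiency ratio are the figures quoted in the answer; the function names and the "relative output" unit are assumptions made for this sketch, and output is assumed to scale linearly with power.

```python
def relative_output(power_mw: float, perf_per_watt: float) -> float:
    """Output of a power-limited site, in arbitrary relative units (assumed linear)."""
    return power_mw * perf_per_watt


def power_needed(target_output: float, perf_per_watt: float) -> float:
    """Power required to reach a target output at a given efficiency."""
    return target_output / perf_per_watt


SITE_POWER_MW = 100.0        # power-capped data center, as quoted
EFFICIENT_CHIP = 4.0         # relative performance per watt (4x, as quoted)
LESS_EFFICIENT_CHIP = 1.0    # reference chip

efficient = relative_output(SITE_POWER_MW, EFFICIENT_CHIP)
less_efficient = relative_output(SITE_POWER_MW, LESS_EFFICIENT_CHIP)

# In a power-capped site, the less efficient chip yields a quarter of the output.
print(less_efficient / efficient)                      # 0.25

# Where power is cheap and plentiful, the gap is closed by adding power.
print(power_needed(efficient, LESS_EFFICIENT_CHIP))    # 400.0
```

Under these assumptions, efficiency decides the outcome only when power is the binding constraint, which is the distinction the answer draws between the two markets.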
We went from a $30 billion, $40 billion a year business to 0. It's a big drop and thank goodness, our demand is so strong everywhere that we're going to continue to grow anyhow. But nonetheless, it's a big loss, okay? The important thing is we're not guessing about China. Are you guys following me? The only thing -- we're at 0. And if -- in some circumstance, the President negotiates some outcome that makes sense to them, it would be a bonus to us. But at the moment, we are assuming 0, okay? Please assume 0. No guessing. When you're at 0, you don't have to guess, you guys.
[indiscernible] from PGGM.
You have to yell at me. I will not be offended.
Apologies.
[indiscernible] from PGGM. I had a question on the reinforcement learning that you talked about. You mentioned cases that are obvious like math or coding, you can compare it. But what about basically all the other cases where you don't have a good sense of what the outcome needs to be somewhere, but the solution is a bit unknown. It's a bit hybrid. What do you see there in terms of reinforcement learning applied to those type of learning models?
Reinforcement learning is really, really good at learning how to do something that is very, very far away, many long time away from the action. I have to take one step, another step, another step, another step, another step, eventually, I get a positive or negative response. Reinforcement learning is good at that, in fact. And it's the reason why reinforcement learning is good at robotics. You say, robot, I want you to walk from here to there. And you only have 2 goals. You have to get your head as high up as you can, and you have to move in that direction as much as you can without falling.
And so that's the -- in order to do that, many joints has to happen, steps -- all these different joints have -- and there are many different motions that has to happen in order to get the head up. Reinforcement learning is very good at these long feedbacks. Yes. That's right. That's -- reinforcement learning is good at that. Yes. The reward function is very far out.
C.J. Muse with Cantor.
I was hoping you could speak to GB300 transition. I think on the last earnings call, you talked about initial output and low volumes in the July quarter and ramp thereafter. I wonder if there's any more specificity there. And just to follow on your comment earlier about the 12-month lead time from wafer to full output. I guess, how does that inform your customers lining up 200 versus 300 and perhaps is your visibility even longer than 12 months?
Yes. Yes, C.J. I appreciate it. We forecasted the GB300 transition last year, and it's showing up at the same time as we forecasted it. As you know, GB200 was late because we had a bug in Blackwell. But B300 did not have the same bug. And so B300 shows up at the same time. The window for B200 to 300 is shorter because of that. But we were planning for this transition at this time a year ago now. And so the transition is going just great because it's basically the same chassis. Going from HGX to this NVLink chassis was a huge difference. Everything was different.
The mechanical process, the mechanical systems, all the electronics are all different. Testers are different, the way you -- even the companies that tested it are different and the way we tested it is different. We used to integrate these computers at the data center. We send the computer nodes, each one of the HGX and the CPU trays. We send it to the data center. They integrate it at the data center and they test it at the data center. Today, that entire thing is tested at an ODM, fully tested, fully integrated, and we ship a supercomputer out the door. It's incredible.
And so the power -- the amount of power necessary in the manufacturing floor went from a few megawatts to tens of megawatts because they're basically building and testing AI supercomputers. And so everything changed. GB300, everything is exactly the same. We decided months ago that we would not change the packaging that sits on the motherboard so that everything remains the same. And that was a good decision, we're in great shape in GB300.
It's Joe Moore again. Could you talk to NVLink Fusion and the potential opportunity there? And I guess one of the more frequently asked questions I get is, are you making ASICs better with that product? And therefore, could it have any impact on the processor business?
Yes. First of all, a lot of ASICs are started. Most of them are canceled. And the reason for that is what's the point of building an ASIC if it's not going to be better than the one you can buy in some very specific measure. And we're moving so fast and the bar that we're raising is so incredible. It's not easy. It really isn't easy to build -- if it was easy to build a Blackwell and say, "Hey, I got 14 guys here. Let's go build a Blackwell." If it was that easy, well, gosh, I don't know why I'm working so hard.
Doing this for 33 years and seems harder than ever. And then somebody goes, yes, I'll do an ASIC. And so I'm delighted to hear everybody is interested in building ASICs, right? I do believe most of them are going to get canceled. However, many of them have approached us about using NVLink. And they're important people to me. The person who's asking me is not a stranger. You guys know that. This person asking me is somebody who said, "Hey, Jensen, listen, we got a whole bunch of your NVLink systems. We have a whole bunch of your chassis. We standardize on everything here. If we had NVLink, we could -- and we could put our CPU in it, say, we could just use the same chassis, extend it, and we'll buy everything else from you. We'll buy everything else from you." And you know that last part, we'll buy from you. You got me, you buy from me. Are you kidding me?
Of course, the person who's asking me needs help. NVLink is a good thing. NVLink, Spectrum-X connects into an ecosystem that I care about. I care about DOCA as much as I care about CUDA. I care a lot about NIXL as much as I care about NCCL. These are all APIs of NVIDIA's that are really important. They don't all run on GPUs. And so I care about all of my ecosystems. And like I said, it excites me when you say I will buy all -- it's a super clever strategy. It's just not easy to do. So we had to start a team to build this thing called NVLink chiplets. And then we signed up a whole bunch of partners to help us integrate the NVLink into the customers and the partners.
And then we took our IP and we made it available to Synopsys and Cadence so they could distribute it on our behalf. And so we're going to turn this whole thing into a nice ecosystem. I think it's going to work out great. NVLink is, as you know, revolutionary. It's really hard.
I think somebody said it's just Ethernet.
This morning, you just presented a new product, the NVIDIA RTX PRO server. What are -- could you give us some sense of the size of the opportunity of the use case of the customers you target with this new product?
Oh, yes, yes. I'm sorry. I got it. The world's enterprise today has no AI. Just go to every single large company, just look at them and just you pick your favorite large company, how much AI did you use in your data center, almost 0. All these data centers all over Europe, 0. And how do you bring AI to that data center? Nothing, because it's not liquid cooled. They need to run Red Hat Linux. They have to run VMware. They have to run Nutanix. They want to run NetApp. They want to -- does it make sense?
It's a bunch of strange things that cloud service providers don't have to worry about. So I'm not sure which one is strange, but bringing AI to traditional enterprise IT is very hard. So the architecture has to be obedient of the past, but innovative for the future. Like, for example, RTX Pro runs Windows, that's pretty crazy. And so it's -- it runs Windows, it runs hypervisors. It runs all the things that IT managers know, they go, oh, that makes me happy. Yes.
So how big is the opportunity? Hundreds of billions. The world's IT -- enterprise IT is now just getting AI. That's why Cisco, Dell, HP, everybody, so excited about it. All of our partner -- the entire storage industry is standardized behind it.
Louis Miscioscia again, Daiwa Capital Markets. Maybe on that last answer, maybe you could just point to the 2 or 3 things that you just announced today that are the most impactful for the near term as you're trying to drive AI into the future.
Ignoring GB200 and 300 since you guys have already heard about that, we're going to grow hundreds of millions of dollars of GB200 and 300, just once we take that off the table, okay, assuming we already talked about that. RTX PRO, no doubt. It is the first universal AI system that you can integrate into a traditional enterprise IT organization. My IT organization doesn't know how to use GB200. I got to go build a separate cloud for them. Their data center says Red Hat, hypervisors, Nutanix, NetApp, those -- they use phrases like that. These are not AI cloud people. And so I need -- you're not going to change them because they've got too much software running. They have to -- they got a company to run. So we have to augment AI into them, and this is the way to do it.
And what's availability of it?
It's in production now. Availability now. Yes, please tell everybody to buy it.
It's Francois from UBS. So I have a quick question on Sovereign AI. I mean, it's a hot topic you have been on in Europe. Is there any control that you can make to this announcement in a way that if you are thinking about all this demand potential, if I'm a country, I want to build as much capacity and as quickly as possible because I don't know where the demand is going to go, but it's going to be big. So I need to build capacity big and fast.
How do you control that? Because obviously, if you do 20 gigawatt factory, $40 billion, $50 billion per gigawatt, that's a lot of money. So is there -- are there any milestone that when you go this project, you say, well, maybe in a 5 years view or maybe 2 years view and then let's see how you are in 1-year view, just to rationalize instead of having like one big year when you install all this capacity, you can do it more in a smooth manner. So I was just wondering how you deal with all the Sovereign AI and -- yes.
It gets built incrementally like you say anyways. Over the last couple of years, these companies have been building up their offtake, what we call offtake. They've been building up their demand. And so -- it gets built up like that anyway, step by step. Nobody puts 5 gigawatts down and then waits for supply -- waits for demand. That's not going to happen. That's right. But you have to start because if you believe in AI in the future, you have to back off and say, okay, I need the land, I need the shell, I need the power, which is either going to come off the grid or be generated. There's a whole bunch of questions to line up long before building the AI supercomputer.
And so the important idea is that we're now talking about infrastructure. And so these are infrastructure time lines. And -- we've been talking to Europe now for some time. This just happens to be the visit where we talk about it with all of you. But the infrastructure is being discussed for well over a year now.
It's Brett Simpson again. I just wanted to follow up. Jensen, what do you think the useful life of NVL72 is going to look like? I mean if I look at, a lot of your customers, they're depreciating over different periods. I don't know if there's a comparison between Hopper and Blackwell, but do you think you can improve the useful life of the racks more? I mean you've got 1.2 million components, I think, in these racks, but how long do they last? Yes.
Two answers. One of it is the useful life and then the other one is your accounting life. I mean most people might account for either 4 to 5 years, depreciate over 4 to 5 years, but their useful life is going to be 5, 6 years, 7 years. And the reason for that is you just have to go back and you might hear us talk about it. Like, for example, this last 2 years, we improved the performance of Hopper by 4x, 4x. In the last 2 years, software running on x86 improved 0x.
We improved our software performance by 4x. The reason for that is accelerated computing is fundamentally different than CPUs. There's a JIT, just-in-time compilation that sits inside CUDA. CUDA is a virtual machine. I can change the software, improve the performance with new algorithms long after you bought the silicon, and we are dedicated to that forever. That's the reason why NVIDIA is doing so well because we go back and help you improve your performance for as long as we shall live. I've got mountains of people doing that. You would never do that for architectures that are -- the installed base is so small. You do that for CUDA because if you do that for Hopper, you benefit how many people?
And so researchers, software developers, people who work on software love doing this because it helps billions of dollars of infrastructure. And so Hopper keeps getting better. Ampere, I'm still optimizing on Ampere. Ampere is now, what, 5, 6 years old. So 2 different questions. I think the accounting life, that's up to them, but I think they're going to find usefulness for years after just as the cloud service providers are. They're very happy with the old stuff.
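[Editorial illustration, not part of the call.] The distinction drawn above between accounting life and useful life can be illustrated with straight-line depreciation. This is a minimal sketch: the 5-year accounting life falls inside the "4 to 5 years" range mentioned in the answer, while the asset cost of 100 and the choice of the straight-line method are assumptions for illustration only.

```python
def book_value(cost: float, accounting_life_years: int, year: int) -> float:
    """Straight-line book value at the end of `year` (never below zero)."""
    annual_charge = cost / accounting_life_years
    return max(cost - annual_charge * year, 0.0)


COST = 100.0            # hypothetical purchase price, arbitrary units
ACCOUNTING_LIFE = 5     # years, within the range quoted in the answer

for year in range(0, 8):
    # Years 6 and 7 show an asset that is fully written off yet may still be in service.
    print(year, book_value(COST, ACCOUNTING_LIFE, year))
```

In this sketch, any year of service beyond the accounting life carries no further depreciation charge, which is why a longer useful life matters to the owner.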
One maybe for Colette. So AI, when you think about AI, maybe the biggest TAM ever for any company. So you would expect companies to not care that much about margins, just try to capture the market. However, you guys have not only amazing growth, but also super high margins. So just trying to understand how you think about that equilibrium between growth and margins when thinking about EPS growth.
Yes. Our work that the teams have done building out the systems is not something to say that it's just a cost plus. It's been an enormous feat through software and complete engineering from the hardware standpoint to put what we put together. So we've looked at it always and every time we do it from a TCO value, what can we provide to these customers and what is their next best option that they could do. And that's how we determine an important part, which is the price.
The cost structure and even the price from the very onset is something that we will always keep about the same price, and we'll continue to fine-tune from the cost perspective. So then now comes down where do we make the investments in terms of our work and the more and more that we can think about strong new strategic investments that we can make to continue to see our platform grow worldwide. And that doesn't mean by saying I'm going to create many different options in the chip. It just says, look at the total piece as a whole. Most of the work and what you saw today was talking about CUDA, the software and everything that we need to do. Long way to go from getting to the enterprises.
The enterprises still need a lot of change management on their existing software they're using. And so our expansion and where we want to continue is take that platform and now enable every type of software system that's out there from the combination of what we put together. Yes, the margins are a strong margin. We continue to be a company quite thoughtful in terms of our investments. This hasn't been a time where we're hiring tons and tons and tons of people because that doesn't necessarily always help you. But we will continue to make the best investments, whether that is in operating perspective or using our cash. Those 2 things together will, I think, continue to enhance the true value that we can provide to investors of our full P&L to do so.
This is what happens when you're being interrogated. This will be the last question. No pressure.
It's Timm again at Redburn Atlantic. Maybe just on the NIMs, NeMo, you've talked about how you developed CUDA, how that's just an incredible part of the moat. When you think about NIMs and NeMo, could you just maybe talk about its significance in the sort of hyperscaler world today? Is NIMs, NeMo a more important part of your moat when you get into the enterprise? Maybe just to kind of give us some sense of just how big a deal it is within that overall.
Yes, great question. If you were OpenAI, you know how to build NIMs. If you were Google, you know how to build them yourself. The entire packaging of that run time, super hard. The amount of software that's inside, we call it a NIM -- thank you. We call it NIM, but the amount of software inside of it, there's CUDA, cuDNN, CUTLASS, TensorRT-LLM, Triton, it's basically a ChatGPT in a box. You download it, you're talking to it. It's an AI. You download it, you say, here's a video. Tell me about this video. Reason about it. Why did it do what it did? What's it going to do next?
It's weird. It's basically an AI in a container. Well, for most of the cloud service providers, they know exactly how to do this. For everybody else, they have no clue. And they shouldn't have to. We should turn it into something like a NIM. It's a modern way of packaging AI. Do you guys understand what I'm saying? A long time ago, in 1993, is it? 1991, the retail box of Windows, they figured out how to package software, started the software industry. I kind of went -- the first time we thought about NIMs, I'm going, it's like they figured out packaging.
We got to figure out packaging for AI so that everybody can easily absorb it and enjoy it. And what Colette said earlier makes -- she made a super important point, which is, remember, one of the reasons why we're able to deliver the value and prove it so is because the entire system of the GPU, the NVLink, the switches, the spine, the software, everything got integrated and delivered a performance level that's 40x higher. Are you guys following me? And Dynamo. You can't -- there's no 40x in an ASIC. You're not going to go Hopper to Blackwell, hey, look at that, 40x. Moore's Law doesn't let you do that. Isn't it right?
You don't have 40x more transistors. How could you give 40x more flops? And so the question is, how did we get 40x more performance? Because we architected everything in whole, and we can deliver the software to do so. Otherwise, you're limited by gross margin plus on TSMC wafers. Does that make sense? You simply can't do what we do without understanding the big picture, architecting everything at one time, distributing the work across, pulling out these amazing things that deliver the throughput so that the customer goes, "You know what, I get it, I believe it, you've been doing it every time, I buy it." And then they'll appreciate the value and we can talk about value instead of cost.
Okay. It's great to see all of you in Europe. Thank you.
NVIDIA — Shareholder/Analyst Call - NVIDIA Corporation
🎯 Key Message
- Core: NVIDIA emphasizes the global spread of AI and focuses on "Sovereign AI" in Europe: local data centers, 20 AI factories and several gigawatt-scale projects. It positions itself as an end-to-end platform (chips, NVLink, CUDA, NeMo/NIMs, RTX PRO) with a large TAM (management cites ~$1.5 trillion globally; $0.5 trillion in US build-outs).
🚀 Strategic Highlights
- Quantum-classical: Focus on hybrid systems; CUDA-Q as the interface, with meaningful commercial work in 2–3 years for early chemistry/biology use cases.
- Enterprise & Edge: RTX PRO (Windows/VM-compatible) is meant to bring AI into traditional IT; Orin/Thor and automotive (autonomous driving ~$5 billion per year) plus robotics and base stations as edge paths.
- Interconnect & Packaging: NVLink chiplets and the NVLink ecosystem are meant to enable partner ASICs; NeMo/NIMs as "AI in a box" packaging for faster adoption.
🆕 New Information
- Availability: According to management, RTX PRO is already in production (available immediately).
- Build-out figures: Management cites line of sight on several European gigawatt-scale projects and a global build-out of ~$1.5 trillion; in addition, NVIDIA plans ~$0.5 trillion of supercomputer build-outs in the US.
- China forecast: China is currently planned at 0 revenue because of export controls.
❓ Analyst Questions
- Quantum commercialization: Analysts asked about the time frame; Jensen sees quantum-classical as a practical path, with useful work in ~2–3 years (20–100 logical qubits as an example).
- Sovereign AI scaling: Questions on the number and pace of gigawatt-scale projects; management sees incremental, regionally driven build-outs (telcos, local CSPs, industrial digitalization).
- Supply & Ramp: The GB300 transition is proceeding as expected; manufacturing lead time of ~12 months and diversification (US, Taiwan, Korea); the loss of China as a clear planning factor.
⚡ Bottom Line
- Implication: The event confirms NVIDIA's platform narrative: a broad TAM, multiple near-term drivers (GB200/300, RTX PRO, European build-outs) and medium-term innovation drivers (quantum-classical, robotics). Risks: capital-intensive infrastructure, long lead times and a China market that is currently excluded; management shows operational control and clear priorities.
NVIDIA — Rosenblatt’s 5th Annual Technology Summit - The Age of AI 2025
1. Question Answer
Good morning, everyone, and welcome to Rosenblatt Securities' Fifth Annual Age of AI Scaling Tech Conference. My name is Kevin Cassidy. I'm one of the semiconductor analysts at Rosenblatt, and it's my pleasure to introduce Gilad Shainer. He's NVIDIA's Senior VP of Networking. Also, we have Stewart Stecker. He is NVIDIA's Senior Director of Investor Relations.
So on NVIDIA, we have a buy rating and a $200 12-month target price. And we're bullish on NVIDIA not only because of their leadership in AI but now their ability to expand into full rack-scale deployments, including scale-up and scale-out networks. So we're fortunate to have Gilad speaking with us today. Gilad is a networking expert. Gilad joined Mellanox in 2001 as a design engineer and has served in senior marketing management roles since 2005.
And of course, NVIDIA acquired Mellanox in 2020 and Gilad serves -- he also serves as Chairman of the HPC-AI Advisory Council Organization, and he's President of the UCF and CCIX consortiums and is a member of IBTA and a contributor to the PCI-SIG, PCI-X and PCI Express specifications. So Gilad also owns or holds multiple patents in the field of high-speed networking.
So with that, first, I'll turn it over to Stewart to go over some of NVIDIA's disclosures.
Thanks, Kevin. Thanks, everyone, for having us. As a reminder, the content of this call may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to risks and uncertainties facing our business. So back over to you, Kevin.
Thanks, Stewart. Yes. So I'll kick off the fireside chat with a few questions, and we'll take questions from the audience also. And to ask a question, click on the quote bubble in the graphic on the top right-hand corner of your screen, and I'll read the question to Gilad and Stewart. Keep in mind that this is a fireside chat working towards the understanding of NVIDIA's network strategy. Gilad will not be taking questions around financial guidance.
So with that, thank you, Gilad, and great to see you again.
Thank you very much, Kevin.
So maybe we'll start with a very high-level question of, what is the strategic importance of networking in AI data centers?
Well, it's a good start. It's a good start. So first, the data center is the unit of computing today. Previously, it was a single element, a CPU or a GPU. But today, it's not the GPU, it's not the server. It's the data center, right? The data center is the unit of computing that we use. Now networking defines the data center. The way that you connect those computing elements together will define what that data center can do. It could range from just building a server farm all the way to building an AI supercomputer that can run a single workload at large scale and to do amazing stuff.
So the networking or it's used to call networking. I'm not referring to networking anymore. It's more like this is the computing infrastructure, okay? It's much more than a switch. It's much more than a NIC. It's a computing infrastructure. And that's why it has become so critical and so important. And that infrastructure will determine what kind of workloads you can do. What will be the efficiency of the data center? What will be your return on investment? How many users? How many workers you can bring in? How many tokens you can support? How many end users you can host on the data center?
And this is where the networking or the infrastructure is so critical. Now when you go and design a networking for AI data centers, it's completely a different task than designing networking or infrastructure for the traditional hyperscale clouds. Here, we're not talking about single server workloads. We're talking about distributed computing. We're talking about workloads that need to run on over multiple compute engines, which could be hundreds and thousands and tens of thousands and hundreds of thousands.
So you need to make sure that every GPU here gets the right throughput. Every GPU needs to be fully synchronized. So the data that goes over the network needs to hit every GPU at the same time. If you create skews on network, if you create what we call tail latency, then 1 GPU is going to finish later than others. And we all know that when you're running an AI infrastructure, the last element to complete the task will determine the entire performance of the data center. So it's the tail latency, it's the throughput, it's the latency cost. It's making sure there is a congestion control. There is a huge amount of elements that are in that infrastructure. That infrastructure will determine what you can do with the data center that you build. That's why it's so important.
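The point about tail latency can be illustrated with a minimal sketch. It assumes a fully synchronized step in which every GPU must finish before the next step starts; the per-GPU times below are hypothetical numbers chosen for illustration, not measurements.

```python
def step_time(gpu_finish_times):
    """A synchronized step completes only when the slowest GPU finishes."""
    return max(gpu_finish_times)

# Hypothetical per-GPU completion times in milliseconds for one step.
balanced = [10.0, 10.1, 10.0, 10.2]
with_straggler = [10.0, 10.1, 10.0, 14.5]  # one GPU delayed by network skew

balanced_step = step_time(balanced)
straggler_step = step_time(with_straggler)

# The whole job slows down by the ratio of the two step times,
# even though three of the four GPUs were unaffected.
slowdown = straggler_step / balanced_step
print(f"balanced: {balanced_step} ms, straggler: {straggler_step} ms, "
      f"slowdown: {slowdown:.2f}x")
```

In this toy model a single late GPU sets the pace for all the others, which is why the average latency says little and the tail matters.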
Great. When I talk to investors, they've been hearing terms now that they're just a little confused on, but maybe as you talk about connecting the entire data center and even data center to data center, but the terms of scale-up and scale-out networking, that's new to some investors. So maybe if you could just give that explanation of what's the difference and why are each important?
Yes. And I'll try to make it maybe a little bit simple because I see that there is terms and people try to define what is scale-up and what is scale-out. First, we can start with examples, okay? When we design an AI supercomputer, our scale-up infrastructure is NVLink, and our scale-out infrastructure could be InfiniBand or Spectrum-X, okay? Those are the examples. Now what's the difference between them?
Scale-up is your ability to build a larger compute engine, okay? So in a scale-up infrastructure or connectivity, we're taking those GPU ASICs, let's call it like that, or GPU packages, and we want those GPU packages to behave like one. And in order to build that one, you need to scale up infrastructure. That's what the scale-up network does. It takes those components, making sure that all of the balance between them, kind of the right message rate, the right connectivity, the right elements are there in order to make those engines behave like one.
And this is why if you see Jensen keynote, he says that his GPU is the rack, right? It's GPUs, not the ASIC. It's like we have NVLink72. So that rack is the GPU and scale-up network enables that. Okay, so scale-up enables to build larger GPU out of the different ASIC components. Now once you define that larger GPU, now you need to connect those GPUs together. And how many GPUs you connect together, it depends on what kind of workloads you're going to run, right? What is the mission that you want to achieve? Connecting those GPUs together in order to form multiple GPUs that would work together and run those larger missions, this is where the scale-out network is needed, okay?
So there is different requirements from a scale-up infrastructure versus the scale-out infrastructure. One's create a larger compute engine and the other one's connect multiple compute engines in order to support the different missions that you want to run on the data center.
So just as an example, it was only last year that NVIDIA was networking scale-up 8 GPUs. And this year, it's 72 GPUs.
Right. And we talked about 576, right, in keynote that Jensen talked about that as moving forward. And it's all been determined according to what is the workloads, what is -- what are the workloads that you need to support. And obviously, as workloads continue to evolve and new workloads continue to emerge and you need to solve new things, then everything in a data center is being added or changed, right, or progress.
So one of the example is that, that unit of computing that was maybe a single GPU then becomes 8 GPUs on NVLink. And now it's 72 GPUs and it's going to go to 576. It's all in order to support what kind of workloads you need to run today or you need to provide today.
And maybe you touched on it, just the workloads. What is happening with the AI workloads and applications that are influencing some of the network requirements?
Yes. So what you actually do is you build a data center, right? And that data center is aimed to serve workloads that you define or the workloads that you want to run on the data center. So essentially, everything is -- needs to be connected together, right? And there's different elements that you can look at. First is building or codesigning the network with the compute, for example, right? And this is important because you design a data center, you don't design components.
And I'll give you 2 examples of what does it mean, codesign. One example is that in the traditional world, let's say, there was compute engines that doing compute, and then there was networking elements that were tasked to move data, right? That was the separation between them. But when you design a data center to serve the AI workloads and you have the ability to decide where to put to what, and now there is no boundaries. And for example, we took compute engines kind of traditionally run on compute components. We took compute algorithms and running them and we're running them on the network.
For example, what we call SHARP in InfiniBand is taking -- is doing data analysis on the data on the network. So the network is not just moving data, it's actually participating in the compute cycles, right? And why are we doing that? Because once you do the reduction operations, for example, on the network, you can save half of the bandwidth that you need to run and you can complete things much faster, okay? So this is an example where you move things from compute to the network.
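A rough way to see where a saving of about half the bandwidth can come from is to compare bytes sent per GPU under two textbook models. This is an assumption-laden sketch: the baseline is taken to be a ring all-reduce (the speaker does not name a baseline), the function names are hypothetical, and it is not a description of how SHARP is implemented.

```python
def ring_allreduce_bytes_sent(n_gpus, message_bytes):
    """Textbook ring all-reduce: reduce-scatter plus all-gather,
    each GPU sends 2 * (n - 1) / n of the message."""
    return 2 * (n_gpus - 1) / n_gpus * message_bytes

def in_network_reduction_bytes_sent(message_bytes):
    """Idealized in-network reduction: each GPU sends its data once
    and the switches combine it on the way."""
    return message_bytes

n_gpus = 1024                  # hypothetical job size
message_bytes = 1_000_000_000  # hypothetical 1 GB gradient buffer

ring = ring_allreduce_bytes_sent(n_gpus, message_bytes)
in_network = in_network_reduction_bytes_sent(message_bytes)
saving = 1 - in_network / ring

print(f"ring: {ring:.0f} B, in-network: {in_network} B, saving: {saving:.1%}")
```

Under these assumptions the saving approaches one half as the number of GPUs grows, which is consistent with the figure quoted in the talk.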
On the other side, traditional network topology was built in the concept of a top-of-rack switch, which means that all of the NICs will go to the switch on top of the rack and then you will connect those switches together. This is the wrong thing to do if you build an AI data center. Because you mentioned, for example, in previous generation with 8 GPUs connected on NVLink. So those 8 GPUs already communicate between themselves on NVLink. They don't want really to continue and talk with themselves again. So why would you put all of those GPUs on the top-of-rack switch? It doesn't make sense. You want to spread that connectivity and have every GPU connect to other GPUs in a fabric, and this is where we created a multi-rail topology, okay?
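The multi-rail idea can be sketched as a simple mapping. The talk does not give the exact wiring, so this assumes one common reading: the GPU at local position i in every server attaches to rail switch i, so GPUs that already share NVLink never share a scale-out switch. The function name and sizes are hypothetical.

```python
def build_rails(n_servers, gpus_per_server):
    """Map each rail (one per local GPU index) to the GPUs attached to it.

    Assumption for illustration: GPU `g` of every server connects to rail `g`.
    Each GPU is identified as (server_id, local_gpu_index).
    """
    rails = {rail: [] for rail in range(gpus_per_server)}
    for server in range(n_servers):
        for gpu in range(gpus_per_server):
            rails[gpu].append((server, gpu))
    return rails

rails = build_rails(n_servers=4, gpus_per_server=8)

# Each rail holds exactly one GPU from every server, so the scale-out
# fabric carries server-to-server traffic rather than traffic between
# GPUs that can already talk over NVLink.
for rail, members in rails.items():
    print(rail, members)
```

Contrast this with a top-of-rack layout, where all 8 GPUs of a server would land on the same switch.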
So now the network is designed the way that the compute is running, the way that the compute algorithms are running. And then we're taking some of the compute algorithms and actually running them on the network because it's much more efficient to do it there, okay? So this is one element: AI workloads require you to actually design a full data center, design that unit of computing, and you want to do that in full synergy, in a full codesign, okay? That's one thing.
The other thing, of course, is that AI frameworks continue to evolve, right? That's why every year, we have a new compute engine that's coming out. There is new network infrastructure or computing infrastructure coming out. There is a new GPU, there is new NICs. There is new switches in order to serve scale, okay, because we see increase in scale. You're moving from thousands of GPUs to tens of thousands. Year after, you go to hundreds of thousands of GPUs. People are talking about now the million-scale GPUs.
You need to actually be able to grow that element. You have so many routes. Just think about it, with all those GPUs, every GPU communicate on the network. There is so many routes that you need to make sure that you send them in the right direction and no one's going to collision with another one so there is no congestion on the network. So there is so much complexity in that network. And that's why you see that there is, every year, there is new generation, new capabilities, new elements that are being brought into the compute infrastructure to support the full data center design and to support the different kind of workloads that we see.
Okay. Great. You're connecting these hundreds of thousands now of GPUs. But maybe if I even roll it back a little bit with the Mellanox acquisition. At Mellanox, you had both Ethernet and InfiniBand. I guess what's the difference in the 2 offerings? Is there scale-out? And if we go into depth and talking about the scale-out networks, how do you decide whether InfiniBand the right solution or Ethernet is the right solution?
Yes. Actually, we let our customers decide what makes sense for them. And maybe I'll start a little bit in the history. Yes, Mellanox did start with InfiniBand. And InfiniBand was built for distributed computing workloads, okay? It was built in a sense that, first, it is lossless. Unlike traditional network that lossy was fine, meaning traditional network works in a way that if there is collision on the network, you don't try to solve that collision, you just drop packets because it's okay. I can retransmit the data.
But when you deal with distributed computing applications, if you drop data, you're going to retransmit the data but you don't just retransmit the data to a single GPU set, for example. The fact that you retransmitted data to a single GPU and that GPU become late in the whole scheme of the workload, now everyone else is waiting, okay? So you cannot -- you don't want to retransmit data. You don't want to drop data. So InfiniBand started as a lossless network. You don't want to drop data. You don't want to create the latency.
And then InfiniBand, in order to do that, brought congestion control and adaptive routing elements later on and so forth. And it was great for scientific computing, great for HPC and essentially, it's great for AI because AI is distributed computing. And today, InfiniBand is still the gold standard for AI. Everyone that builds a network always compares its network to InfiniBand. Even when we did Spectrum-X, kind of creating Ethernet for AI, we compared it to InfiniBand. That's the gold standard. It is the gold standard. It brings elements that exist in no other network and it's a great solution.
So if you build an AI factory, single job, running at large scale, there is nothing better than InfiniBand, okay? It's the gold standard. Now NVIDIA also brought Ethernet, right? We designed Ethernet for AI. And you can ask -- if you have InfiniBand and InfiniBand is so great, why did you guys build Spectrum-X? And the reason for that is that we believe that AI is going to go everywhere, okay? Every data center will run AI. And therefore, there will be AI clouds, multi-tenancy, multi-workload and multi-users. There will be AI in enterprise, right? I'm talking about enterprise AI. We see a lot of enterprises now adopting AI.
Those areas are being built by people that are familiar with Ethernet for many, many years. They build their software stacks. They build their management tools all on Ethernet, okay? And if they continue running with Ethernet and keep their management and keep how they support their enterprise company and so forth, that would be much better for them. AI is evolving so fast and, therefore, start learning how to handle InfiniBand, for example, manage InfiniBand, meaning they're going to lose the train, right? So we wanted to help them.
We knew it's going to go to everywhere. And everywhere means that we want to bring Ethernet to AI, okay? We want to enable Ethernet as an option for AI. And for people that build AI data centers and they are familiar with Ethernet, their software depends on the software ecosystem of Ethernet, okay, all the tools that were created. Their own management infrastructure that is there and was built over the years and progressed over the years around on Ethernet, we don't want them to create -- recreate it again, okay? So for them, Ethernet is a great thing.
And this is where we built Spectrum-X. Now one important thing to kind of know what we did in Spectrum-X. Well, Spectrum-X is the first generation of Ethernet for AI because nothing in Ethernet fits AI before Spectrum-X came. Spectrum-X is actually not the first generation. And the reason is that what we did is that we brought things from InfiniBand, from the multi-generations of InfiniBand that continue to evolve over the years. We brought those elements to Ethernet, okay?
So that's why Spectrum-X on one side, it's kind of the first Ethernet for AI, but what we brought inside has years of development on the InfiniBand side. So that's why it came in very mature, very quickly and actually completely aimed to solve the problems of AI on Ethernet. And for example, we brought lossless to Ethernet because we don't want to drop packets, right? So we brought lossless. We brought adaptive routing capabilities. So you have lots of flows between GPUs. You want to make sure that every flow will go in the best available path, right? It's like solving the routes in a sense.
We brought the congestion control to Ethernet. Okay, no collisions. You want to make sure no collisions. That application, one application cannot impact another application by creating collisions on the network. We brought many things from InfiniBand into Spectrum-X and actually created Ethernet for AI. And now you have InfiniBand, which is a gold standard. And if you're running, building supercomputer in a single job, if you know how to manage InfiniBand, use InfiniBand in the past, there is nothing better than that.
And if you're running Ethernet, you can keep running Ethernet. We brought the best Ethernet for AI on Spectrum-X. And a good example for that is that Spectrum-X is running more than hundreds of thousands of GPUs, more than 100,000 GPUs in a single data center for a single workload. There is no other Ethernet technology that has managed to achieve what Spectrum-X did. And the reason is Spectrum-X is built for AI.
So we have a great Spectrum-X. There is great InfiniBand with Quantum. And now people can choose what makes sense for them based on their workloads, what they need to serve, what they're building, what's their familiarity, what is their software ecosystem and so forth.
So that's just the mix. You can put those features into Ethernet, and that's more of a physical layer features that you're doing. And so it doesn't affect the customer's Ethernet that they have been running for years?
So it's not in the physical. It's a combination. It's the physical, it's the link layer, the transport level. It's the way that the NIC runs with the switches, okay? One of the things that made InfiniBand so great is that it's a platform, okay? It's not a separate NIC, a separate switch. It's like it works together. The NIC get information from the switch network in order to determine the flow of data. The switch element knows how everything in the data center behaves, okay?
It's like you need to know not just your own status when you do routing on the network. You want to know the status of your neighbors, and the neighbors could be the NICs on the switches. Because if my neighbor switch has some issues, for example, I don't want to continue to send data to the same area. So there is a global load balancing that happens. And the NICs work in conjunction with the switch, it's a full end to end, okay?
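The idea of routing on neighbor status rather than only local status can be sketched as follows. This is a toy illustration, not Spectrum-X or InfiniBand behavior: the field names `healthy` and `queue_depth`, the switch names, and the selection rule are all hypothetical.

```python
def pick_next_hop(neighbors):
    """Choose the healthy neighbor that reports the shortest queue.

    `neighbors` maps a neighbor name to its reported status, e.g.
    {"switch-a": {"healthy": True, "queue_depth": 12}}.
    """
    healthy = {name: status for name, status in neighbors.items()
               if status["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy next hop available")
    return min(healthy, key=lambda name: healthy[name]["queue_depth"])

# Hypothetical telemetry reported by three neighboring switches.
telemetry = {
    "switch-a": {"healthy": True, "queue_depth": 40},
    "switch-b": {"healthy": False, "queue_depth": 0},  # has an issue, skip it
    "switch-c": {"healthy": True, "queue_depth": 7},
}

chosen = pick_next_hop(telemetry)
print(chosen)
```

A real fabric makes this decision in hardware and per packet or per flow; the sketch only shows why a sender needs status from its neighbors to avoid a troubled area of the network.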
So this covers everything from PHY to link to transport. Now on top of that, you have all the management stack and you have all the cloud management tools, for example, and hosting multiple tenants and so forth. That runs on top of that infrastructure, not on the network, okay? So what we brought into Spectrum-X cover all the infrastructure limit. Everything that runs on top of that could be the same.
And this is where it goes easily into people that build Ethernet for data centers and build their software ecosystem. Now it actually goes directly there and you bring them the elements of the infrastructure that are needed for running AI training or AI inferencing.
Okay. So your Spectrum-X could be mixed in with standard Ethernet? Some other racks could be running standard Ethernet?
So Spectrum-X is Ethernet, so it's interoperable with any other Ethernet devices. So if you build, for example, an AI data center, you built a unit of computing, right? So that means that Spectrum-X will be the scale-out infrastructure, for example, and it covers the full stuff. Now that data center can be connected to other parts of your infrastructure, right? It can connect to storage, it can connect to another data center, it can connect to users, their desktops and stuff like that.
And this is where you might see other kinds of Ethernet, right, connecting to desktop. Traditional Ethernet is great. And of course, you can connect that traditional Ethernet into that data center that has Spectrum-X for the scale-out infrastructure.
Okay, great. And maybe if we could touch on too, you had mentioned the NIC cards and your DPU or the BlueField. Can you talk about the importance of that, of having control over the DPU within that same network?
Yes. So the DPU actually bring another element of that infrastructure. And so when you build a data center, there is no 1 network. Traditional world, there was 1 network. If you go to hyperscale clouds, there is 1 network. You build an AI supercomputer, different story. And you mentioned that already because you asked the scale-up and scale-out. So here are 2 networks, right? There's at least 2 networks, scale-up and scale-out.
Now there is also an access network, meaning users need to access the data center. That's the third network. Now there is also a storage access that might be even a fourth network, okay? So there are multiple elements in that AI data center, and there are different components to each. So if you look at an NVIDIA AI data center, we use NVLink for scale-up. We use Spectrum-X or InfiniBand for scale-out. And that scale-out includes the switch and includes what we call SuperNIC, okay?
And that SuperNIC has compute element inside of it in order to determine that injection rate and process telemetry from data from the network and so forth. And then we have the DPU on the access network because what the DPU enables to do is to move the data center operating system from the server compute engine into something else. And that greatly help with security, okay? So if you build a data center and your hypervisor, for example, your hypervisor is going to run on the same CPU that hosts the user applications, you have a security threat because the user can get access to the hypervisor and now can control the entire data center, okay?
So in order to make it much better, you want to separate the infrastructure domain from the application domain. The CPU will host the users on the system, but you're going to run the hypervisor, for example, or other element of the infrastructure operating system on a different element, let's say, completely separate from those applications where the application is running. This is where the DPU plays a role. So we're running -- the DPU is being used in order to run the data center operating system to provision the servers, to do the secure access to the user that's coming into the data center and so forth.
So DPU is the north -- what we call north-south kind of the access network. And then SuperNIC and the switches or Spectrum-X and InfiniBand are part of the scale-out infrastructure, kind of the -- some people call it back-end network or some of them call it compute network or compute infrastructure. And then you have also the scale-up where there is another element of NVLink.
Okay, great. So the DPU also it's kind of freeing up the CPU for doing cycles of what it's good at and the DPU. So it's -- yes, that's -- maybe if we switch over and talk a little bit about the scale-up network, the NVLink. You have NVLink and NVSwitch. There's other topologies out there. I guess what's the advantage of NVLink over there's UALink and even Broadcom on their earnings call, they've been talking about just using Ethernet as the scale-up. Can you kind of give the gives and takes of each one?
Well, I can definitely talk about what we do. So first, scale-up is not easy to do. It's very not easy to do. It needs to take those GPU ASICs and make them one, okay? It needs to form like 1 unit out of a lot of ASICs together. And therefore, it's not just the huge amount of bandwidth that need to run between them. It's you need to have a very high message rate that everyone will -- all ASICs will connect and communicate together as like 1 unit, okay?
You need to have a very low latency between them. It's a very tight network. And because of that, we are trying to put everything in a rack, okay? So we can use, for example, copper for that connectivity because copper, first, it consumes zero power. And because of the huge amount of bandwidth, if you would do it on something else, it's going to be a good amount of power being consumed there. You want to make it very resilient and so forth. So we want to maximize copper. That's why we want to put everything in a rack, in closed rack.
And this is where density becomes an interesting element to deal with, and this is where we bring liquid cooling into the game because we want to pack everything to increase the density so you can maximize copper and build like 1 unit, okay? So there is a good amount of complexity in actually building an NVLink element.
Now one thing that isn't as obvious that NVLink brings is it's working, okay? NVLink, it's in the fifth generation. And essentially what made InfiniBand so great is because it had many generations in it, right? It continue to evolve and it continue to be better and better and better. And that's what makes Spectrum-X so great because we took all the 25 years of InfiniBand and put it on Ethernet, okay? So putting an idea of a network and says, okay, one, the first shot, my first shot is going to make it so great. In reality, it's not the case, okay?
So this is a complicated element. There is a huge amount -- just think about NVLink72 is like 130 terabytes per second, okay, in a single rack. It's like the entire peak Internet traffic is just running in a single rack, okay? This is what you need to support in that set. So it's fifth generation. It continued to evolve over the years, connecting more and more GPUs. We brought SHARP into NVLink. Actually, there are compute engines. There are compute algorithms running on that NVLink when you're running everything together.
So this is kind of NVLink. I tried to give you a little bit on the complexity of it. And obviously, having the fifth generation, it just show that you evolve from GPU to GPU and you bring more elements, more capability. You need to adjust to the workloads, okay? I'm not sure that I mentioned it, but the reason that we're annual cadence on the infrastructure, not just on the compute is because the element that you need to bring into the infrastructure, including those data algorithm that are being added from generation to generation because the workloads are different, because the workloads are being modified. And as the workload is being modified, the compute algorithms need to be modified, and that's impact what you put on the infrastructure, which include NVLink and the rest.
So this is where the cadence and being robust, it's working, it's amazing technology. It's liquid cooled, it's the dense, fully copper, and that's what make NVLink, NVLink.
Great. And you also announced NVLink Fusion. So you opened it up that it's not a closed network. Can you talk about the advantages of NVLink Fusion?
Yes. So once you get a sense of how complicated scale-up is, and people might say, "No, it's easy." No, it's not easy. There is a huge amount of complexity in it. Then why not to help our customers that wants to build their own custom accelerators, for example, leverage what we invested for years building the best scale-up infrastructure with the liquid cooling, with the density, with all the aspects of that, the performance of that, why wouldn't we let our customers leverage that huge amount of investment and making it easier for them to take those accelerators that they build, those custom XPU that they build, the custom accelerators actually leverage our infrastructure to build a solution for them, okay?
We design a data center. We design it as a whole. And then you can take pieces off it. And you can take the GPU, you can take the CPU, you can take them both together. You can also take the infrastructure if you want to, okay? So this is where we build or working with ecosystem, which includes MediaTek and Marvell and Alchip Technologies and Astera Labs, for example, and CPU suppliers like Fujitsu and Qualcomm and working with them so they can leverage what we do. But infrastructure, we started this talk with saying the infrastructure becomes a key element.
And essentially by having NVLink Fusion, we enabled that key element to be used by people that needs or wants or require to build their own accelerators and now they can leverage what we did, what we designed and actually get great data center for their own custom elements.
And maybe just to understand that, is it Fusion link like you mentioned Qualcomm, just use them as an example, if their CPU wants to connect, do they pay a license for the NVLink? Or do they just start using your NVSwitch?
I think that there is element of NVLink that they will need to connect to. So essentially, they need to get the interfaces and they need to get, for example, an NVLink [indiscernible] that the CPU can connect to it. And once they have that, they connect to their NVLink Switch so they can acquire the NVLink Switch, and they can acquire the entire elements that also come there with the liquid cooling and all the stuff. So they are taking elements from us. They're taking the API from us. And obviously, we work with them, and they can build their own system.
Okay. And we are getting lots of questions from the audience. But the people, you've introduced the silicon photonic switches at GTC. People are asking NVLink, when does that go fiber? And also when we talk about scale-out, is that already fiber? Or what happens with silicon photonics tied into all of this?
Yes. So different elements here. So first on the scale-up, let's put it like that. Copper is the best connectivity. Copper is the best connectivity, zero power. It doesn't consume power. It's very reliable, okay, and it's very cost effective. So you would like to use copper as much as you can, as much as you can, for any connectivity that you can. And therefore, we are trying to put as much as compute density in a rack because within that rack, we can use copper. And that's why we're investing a lot, right, to increase the amount of compute in the rack. So we can use copper and run that because there is nothing better than copper.
Now when you go to the scale-out, this is where you're talking about distances, right? Because now we have racks that needs to be connected and you're out of the reach of copper, and you need to go and use optics and you need to use optical connections. Now in traditional data centers, the amount of connectivity between rack was very, very -- between racks was very, very small, okay? So there is no much optic transceivers or optical connections that were there.
When we look on an AI factory, every GPU has a NIC out, right? So if we look on Blackwell, every Blackwell has an 800-gig NIC that goes out. So the scale-out infrastructure, actually, there is a good amount of optic connectivity. We need to use around 6 transceivers for every GPU. So if you build 100,000 GPU data center, it's like 600,000 transceivers. And now the power that's associated with the optical network becomes something that can consume up to like 10% of compute.
So if I'm building 100,000 GPUs and I can add another 10,000, that's not a small number, okay? So now the power becomes something that you want to look how to improve it. And we all know that the limiting element in building data center is power, right? It's not really space, it's actually power. So as much as you can save power and you can redirect it to compute engines, that's a great thing to do.
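The arithmetic in this answer can be written out directly. The inputs are the figures quoted in the talk (100,000 GPUs, around 6 transceivers per GPU, optics consuming up to about 10% of compute power); treating the full 10% as reclaimable is a simplifying assumption that gives an upper bound.

```python
gpus = 100_000
transceivers_per_gpu = 6   # "around 6" per the talk
optics_power_percent = 10  # "up to like 10% of compute" per the talk

total_transceivers = gpus * transceivers_per_gpu

# Upper bound: if all of the optics power could be redirected to compute,
# how many additional GPUs would the same power budget cover?
extra_gpus_upper_bound = gpus * optics_power_percent // 100

print(f"transceivers: {total_transceivers:,}")
print(f"extra GPUs (upper bound): {extra_gpus_upper_bound:,}")
```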
The second thing is that data centers increase in size, and it goes fast, right? It's like 2 years ago, we talked about like 16,000 GPUs as large data centers. Now you're talking about hundreds of thousands of GPUs. So 100,000 GPUs, 600,000 transceivers, and it takes time to install that and it takes time to manage that and you might need to replace elements. There are so many components that you need to deal with, okay? So this is kind of -- this is the right time for improving the optical network for the scale-out.
And the way to improve that is to introduce co-packaged silicon photonics, right? And co-packaged silicon photonics, what does that mean? It means that instead of having the optical engines in every transceiver, I will take those optical engines and put them next to the switch and package them together with the switch. Now what did I do here? First, I reduce distances, okay? So if the optical engine is in a transceiver, the signal needs to go the distance through the transceiver, the cage, the PCB, the substrate to get to the switch. I reduce the distance, and with that distance, I reduce the power.
So now on the same ISO power, I can put 3x more GPUs. On the same ISO power of the network, I can connect 3x more GPUs. That's huge. Now I'm reducing transceivers, okay? Now I have 1 transceiver per GPU, not 6. Think about how many elements you reduce from the data center, which means it's not just I increase the resiliency of the data center because now there is less elements, I also reduce the time to operation. I can build the data center much faster.
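The component reduction described here can be checked with a short calculation. It reuses the 100,000-GPU example from earlier in the talk and the quoted figures of 6 transceivers per GPU before and 1 after, plus the 3x claim at the same network power; everything else is plain arithmetic.

```python
gpus = 100_000

transceivers_before = gpus * 6  # pluggable transceivers, per the talk
transceivers_after = gpus * 1   # with co-packaged optics, per the talk
transceivers_removed = transceivers_before - transceivers_after

# "On the same ISO power of the network, I can connect 3x more GPUs":
# equivalently, network power per connected GPU drops to about one third.
gpus_at_same_network_power = gpus * 3
network_power_per_gpu_ratio = 1 / 3

print(f"removed: {transceivers_removed:,} transceivers")
print(f"GPUs at the same network power: {gpus_at_same_network_power:,}")
```

Fewer discrete components is also what drives the resiliency and time-to-operation points made in the same answer.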
So CPO brings such a great element, and we started with the scale-out because, again, it's like 10% of compute power. I can increase that number, it's huge, to reduce number of components. There is a huge amount of benefits bringing co-packaged optics into the scale-out infrastructure. Now on the scale-up, as I can use copper, I'm going to continue to use copper, okay? So we increase the density with copper because there is nothing better than copper. As long as you can use copper, we use copper. Okay. So this is where we continue to use copper. We announced that we are having 576 GPUs on copper with NVLink. Scale-out is multi-rack, it's distance, it's optics, and this is where co-packaged optics will be a great thing.
Do you have an idea if we can get to 576 with copper? When do we have to cut over to optics?
It's a good question. Over the years, there was always people saying, "Oh, this is going to be the last generation of that." It will be the last generation of that, right? It's like every time when people say it's going to be the last generation, apparently, there is another one. So as long as we can pack, we'll pack.
Okay, good. Great. I'll see if I can -- let's see if there's a question we haven't covered yet. If you can answer this. If you're winning in the market, are you displacing what Marvell and Broadcom solutions are or solution providers like Coherent, are you replacing their designs?
The first answer is not really, okay? The reason is the following. First, there is many infrastructure in the data center, and there is many areas that requires a need to use transceivers, okay? So on the scale-out infrastructure, we're going to introduce co-packaged optics. North-south network, for example, require transceivers. We put transceivers on NIC and so forth. And since the data centers are growing and the market is growing, there is enough for everyone. And therefore, we're not replacing anything, but there is different infrastructure and there's infrastructure areas that require transceivers. That's one thing.
The second thing, we are working with that ecosystem of partners, and they are part of our CPO infrastructure. So they are contributing into what we're doing on CPO and they're bringing our elements, and we're working with the ecosystem. For example, we announced working with TSMC on packaging, but we're working with a lot of vendors that you mentioned on lasers and optical arrays and the different elements that we need for connectivity. So they are contributing to our CPO infrastructure as well, they have more or a good amount of transceivers to continue and support. And data center is growing. AI is going everywhere. There is enough for everyone.
Great. I think we're out of time here now. So I would say in summary, you've got the scale-up and scale-out networks and you're tying together hundreds of thousands of GPUs to act as one big GPU. And if people want to come into the NVIDIA network, they're open to do that. You think you've got the right solutions. So maybe if you want to give a closing remark also.
Yes. I think you had short questions and I had long answers, and I'm sorry for that. In the past, people's data center budget was focused on, let's buy as many servers as we can. And then if we have something left, we may connect them together. If something is left after that, we may do some storage and stuff like that. I think now folks realize that the infrastructure is key, okay? It's not just network elements, buying a NIC and buying a switch, no. You're buying a spaceship, okay?
You're buying a supercomputer. You're buying something that requires the kind of to be fully synchronized with the data center, and that infrastructure will determine what data center will do, okay? That infrastructure will determine if those compute engines are just a server farm or that's an AI supercomputer for training or inferencing, okay?
So it's a key element. Its importance will continue to increase, and we'll see innovative technologies coming into the infrastructure. So it's something that keeps us excited. Yes, so this is where the infrastructure is. I think more people are interested in learning about it, and I'm happy that we were able to talk today. And I hope that we provided people with a better understanding of the infrastructure that we built.
Yes. That's great. Thank you. Thanks, Stewart. Thank you, Gilad. Thank you very much.
Thanks, Kevin. Thanks, everyone.
NVIDIA — Rosenblatt’s 5th Annual Technology Summit - The Age of AI 2025
📣 Key Message
- Core statement: NVIDIA positions networking not merely as interconnect technology but as an integral part of the compute infrastructure for AI data centers: scale-up (NVLink/NVSwitch) for "big GPUs" in the rack, scale-out (InfiniBand / Spectrum-X Ethernet) for linking many racks, plus the DPU (BlueField) for infrastructure domains.
🎯 Strategic Highlights
- Network co-design: NVIDIA moves parts of the computation into the network (e.g., SHARP reductions) to optimize bandwidth and latency; topology design (multi-rail) follows the compute pattern rather than classic top-of-rack principles.
- Product architecture: InfiniBand remains the "gold standard" for large, single-job supercomputers; Spectrum-X brings InfiniBand capabilities (lossless, congestion control, adaptive routing) to Ethernet ecosystems to ease enterprise adoption.
- Infrastructure investments: NVLink (scale-up) scales from 8 to 72 GPUs, with plans for 576 GPUs per unit; NVLink Fusion opens the scale-up ecosystem to third-party accelerators; DPU/BlueField separates the infrastructure OS from the application CPU; co-packaged optics (CPO) is intended to sharply reduce transceiver power and complexity.
🔭 New Information
- Technical updates: Concrete pointers: Spectrum-X is deployed in production environments with more than 100,000 GPUs; NVLink target sizes of up to 576 GPUs were announced; NVLink Fusion lets partners connect to the NVSwitch/packaged infrastructure; CPO is intended to drastically cut the number of transceivers per GPU (example: from ~6 to 1) and thereby reduce power and installation effort.
❓ Analyst Questions
- Copper vs. optics: When to move to optics? Answer: copper remains preferred within the rack (dense, zero power draw), optics/CPO comes in for distance and scale-out; a pack-as-long-as-possible approach.
- InfiniBand vs. Ethernet: The decision is customer-driven. InfiniBand for single-job supercomputers; Spectrum-X for broad Ethernet-based cloud and enterprise adoption with AI capabilities.
- Partner market: Displacement question (Marvell/Broadcom/Coherent): NVIDIA sees coexistence; partners supply components for CPO and remain part of the ecosystem.
⚡ Bottom Line
- Investment relevance: NVIDIA is building technical moats in AI infrastructure: parallel offerings (NVLink, InfiniBand, Spectrum-X, DPU, CPO) increase market breadth and lock-in. The event provided no financial guidance; investors should watch Spectrum-X adoption, NVLink Fusion partnerships, and CPO rollouts.
NVIDIA — Nasdaq Investor Conference 2025
1. Question Answer
[ Welcome to the ] 2025 NASDAQ Investor Conference. My name is Janet Harbison, and I lead International Equities at Jefferies here in London. We are proud to be partnering with NASDAQ for the 10th consecutive year on this event, and I can confidently say that this is the best lineup ever.
I want to take a moment to thank Jack, Daniel McCartt and Andrea Joffe from NASDAQ for their incredible partnership and dedication. On the Jefferies side, a huge thank you to Abigail Charkham, Edyta Balsam and Tanya Khosla for their tireless work behind the scenes to ensure a seamless experience for all of you. And of course, thank you to the corporates for making the trip to London. What an exceptional lineup we have.
The NASDAQ has long been a bellwether for innovation-driven growth. Over the past 5 years, it has consistently outperformed broader market indices, reflecting the strength and resilience of technology and biotech sectors. This performance underscores NASDAQ's role not just as a stock exchange, but as a global platform for companies shaping the future, from AI and semis to cloud computing and digital health.
A few quick words on Jefferies as we will be better known to some of you than others. We're one of the fastest-growing investment banks globally with a 60-year-old firm, a $60 billion balance sheet and almost 7,000 professionals across more than 40 offices in Europe, the Middle East, Asia and, of course, the Americas. We focus exclusively on global markets, investment banking and asset management. U.S. equities are a key focus of [indiscernible] delighted that we're welcoming our Jefferies team from Stockholm, Frankfurt, Paris as well as multiple U.S. offices here today.
Within equities, clients are often surprised to learn that Jefferies now has the broadest global equity research coverage on The Street, covering over 3,500 stocks. And most recently, we've added Latin America, MENA and Canadian research. What truly differentiates Jefferies is the global nature of our business and the depth of collaboration between regions and teams.
Please do speak to me or my colleagues if we can help you with your business further or if you would like to be introduced to other parts of the business. This spirit of global collaboration and insight is at the heart of our upcoming lunchtime panel on semiconductors, where we look forward to hosting our global semis analysts: Blayne Curtis, our U.S. semis analyst; Janardan Menon, Head of European semis; and Edison Lee, Head of Asian semis. We hope you'll join us for what promises to be a fascinating discussion.
Before I close, one small ask. In 2024, we had record results in Institutional Investor, ranking #5. Jefferies was the most-improved firm for the fourth year running, and we have almost 80 analysts ranked in the U.S. and Europe. We care and we would be incredibly grateful for your 5-star votes in the U.S. survey that is currently running, especially for the tech team attending this conference and helping make it happen. Many of you in this room have been instrumental in our journey. Thank you for your trust and support.
It is now my pleasure to hand over to Colette Kress, CFO of NVIDIA; and Blayne Curtis, Jefferies Head of Semiconductor Research. NVIDIA has been a trailblazer in the tech industry, revolutionizing fields such as AI, gaming and data centers. Under Colette's financial leadership, NVIDIA has achieved remarkable growth and innovation, making it one of the best performing and most exciting tech stocks globally.
Welcome to the Jefferies stage, Colette and Blayne.
All right. Thank you all for joining. I'm Blayne Curtis. Obviously, you know Colette Kress, and very happy to be kicking off the conference with NVIDIA, obviously, been an incredible story over the last couple of years with AI particularly.
I think we want to start on the kind of demand side because I think one of the new interesting kind of drivers is sovereign AI. I think Jensen has kind of talked about it as the next growth driver, in fact, I think at GTC, talked about maybe sovereign would be the biggest spenders. He said non-CSPs. So maybe just kicking off there, obviously, you've been talking about it for several quarters. There's some Middle East announcements. I think Jensen promised some European, you have GTC Paris coming up here. So thank you for joining and maybe start there.
Yes. Thanks so much for having us here. I'm pleased to be here. It's been a while since I've been here at the conference and been able to speak to so many of the investors. So really appreciate that you all came out for today.
I have a little bit of an opening kind of little statement that I have to say. Before we begin, as a reminder, the content of this meeting may contain forward-looking statements, and investors are advised to read our reports filed with the SEC for information related to risks and uncertainties facing our business.
Well, first, I want to talk about some of the things that occurred over the last couple of days. Jensen was here in the U.K., working here with the Prime Minister. And the Prime Minister and Jensen together really work to develop opportunities within the U.K. and focusing on that sovereign piece of it. And we will be looking to build out infrastructure here in the U.K., supporting many of the industries that are here, many of the start-ups that are here, focusing on what they can do for AI. We know this is an important time to help them, help them in terms of building that AI and infrastructure just to start that fuel that's going to be necessary for their AI solutions.
Now thinking about that and here in the U.K., a lot of discussion about referring to it as the Goldilocks place. And the Goldilocks place was really a common way that we try and think about the importance of the great talent that is here, the great AI talent, the great start-ups that are here in the U.K., and we couldn't be more proud to be there. So we will also be here in GTC Paris. That is correct. So shortly after today, we just head on over to Paris, where we will also be talking about sovereign in a bit different part of the world in terms of the EU.
So sovereign is a very big piece and a focus of where we are concentrating. Keep in mind, the world of AI has moved probably the fastest of any other technology across the globe that we've seen in history, from the onset of what we saw in terms of ChatGPT, the instantaneous understanding worldwide on how important AI would be for our future. And all countries, all enterprises, all people, all consumers are all thinking about how AI would work there.
We're happy to be just a proud partner with so much of that work in terms of our platform and what we've put together. But sovereign is a big piece. We have been in the Middle East, as you indicated, and we ended up speaking with not only Saudi Arabia, but also the UAE. And I think it led to what you heard is the tens of gigawatts that would be available through many of those nations. It was an important time because it was the U.S. government together with what we were seeing in the Middle East leadership as well. And I think that will be a great start for such an important part in what they can do to influence both from the capital, the data center and our help from a platform to do so.
How large is sovereign? How large is sovereign is always the question in front of us, but it is going to be a very, very large piece. Look at it in this perspective: every country will need their own ability to have their AI within their country. Using just one standard foundational models and some of them that are available in the United States, you're going to see many of these foundational models begin in a lot of the countries that are here. That's the ability for you to have your own language, your own culture, your own data that you will likely want to keep inside of that country. That's why the sovereign piece is such an important piece for us.
It will be just as your GDP would likely be, growing as your GDP does and being a very big part of that. So we see in just right now, probably tens of billions of dollars that will be surfaced. But again, when you look at the size of this, you can be approaching over several, several years, could be close to $1 trillion. So these are key areas about why we're here, why we're here in this part of the world and why we're focusing on a lot of different parts because sovereign is going to be a big piece.
I want to follow up. You kind of partially answered it, but the question I get a lot is, who's the ultimate customer? You have a sovereign-funded data center in the Middle East per se. Is it the customer going to be Microsoft and it's just a regional data center? Or I think you answered there will be specific national efforts, models, data and such. Maybe you can elaborate on that.
And then in terms of timing, I get this a lot as well. We've seen some announcements. I'm assuming these are massive data centers, gigawatts, probably need to build buildings first and then fill them. So maybe you can walk us through a little bit of the timing behind some of these statements.
Yes. We get a lot of discussion, and there's a lot of folks interested in being a part of sovereign. All will be partners within what would be built in terms of sovereign. When you think about what is necessary, and each country probably being a little bit different, what participation does the government issue in many of the countries. Remember, they're also very important in terms of the telecom business or what we need for the Internet. You can imagine what they are using for AI will also be part backed by parts of the AI. What we'll see, though, is not necessarily a standard model. Every country will probably do that different. But the governments are very focused in terms of what they need to do to support the country as a whole and have been a very big part of a lot of the fundraising that will be necessary.
You then go into who is that builder? The builder can be absolutely CSPs that you're seeing, but you also see a new brand surfacing that will also be very important and we refer to often as the neoclouds. The regional clouds that will be stood up that may not be standard with the larger clouds that you see, but really customized and be focused on more of a private cloud, providing specific data or a specific model for 1 or 2 different types of customers. These may also be what you'll see in terms of enterprises in these nations, enterprises building AI factories through these neoclouds to be put together.
So many can contribute to that. Much of the European Union and folks had seen supercomputing as being an important industry. This can be a focus of moving towards AI included in their accelerated computing focus that they also did on supercomputing. So a lot of opportunities for all to join in that perspective.
Now how soon? What will we see first? What we heard in terms of in the Middle East, for example, some of the important foundational things that leads to these types of builds going forward is, first, the focus in terms of capital. The capital, the availability of capital that can be earmarked for these large clusters are an important piece, but also the support in terms of the data center builds. We have a separate group that are focusing on where will the power that is necessary for these data center complexity also be put together. So those are some of the first things that we're already seeing, each going hand-in-hand in terms of what we'll build in AI.
So I want to kind of finish up on the demand side. I mean, actually, to begin this year, there was a lot of questions about what the sustainability of the level of spend, which you're going to get when you see that kind of growth. I thought it was interesting, Google talked about serving 480 trillion tokens in a month, which is up 50x. So we've heard comments from the CSPs, they don't have enough GPUs. They can't serve the inference that they need to.
I'm kind of just curious from your perspective, maybe wrap in the demand for Blackwell and just overall demand perspective to start the year here and what are you seeing in terms of drivers between training and inference?
Yes. So probably since ChatGPT, a little bit more than 2 years into this, but keep in mind, we're just at some of the early beginnings of what is going to be necessary going forward. Yes, foundational models continue to be trained, but new and advanced models are very, very predominant at this time. What you see is reasoning models, taking a significant amount of need of compute. The 3 scaling laws are still a big part of it. From the foundational part and moving into reasoning models, you see a significant amount of more compute that is necessary for those models.
Additionally, that moves us to the inferencing or what we often refer to as the token generation. This is where applications coming to market and producing tokens that you are seeing both with [ new ] models and the future of agentic types of models. Agentic models are essentially doing work for you, not just actually reasoning and giving you answers. It would be great to see so much of the work that we do today, so much of the manual work that could be done with some of those agentic models.
Blackwell has been engineered specific for a lot of those reasoning models and particularly for inferencing. Right out of the gate, when we shipped our GB200 NVL72, several of our customers stood it up just to look in terms of the size of inferencing improvement. The inferencing improvement, as we have now focused on accelerating just about every part of that Blackwell infrastructure, has been key. That software platform, also very important in terms of influencing the inferencing performance. And as you've seen, what they can do in terms of token generation is an X factor greater than anything that they've seen before. We're seeing folks actually use our Blackwell directly for inferencing, not just for the training upfront. Both of these are important factors that are driving there.
So many of our customers absolutely see more and more needs for more compute as we continue to scale. So it is not just focused in terms of one industry or one part of the world, each and every industry is growing. So yes, as we recognized in our guidance that we provided for the quarter, we do see strong growth. We see strong growth in terms of Blackwell, even with the backdrop of some of the challenges that we've had in terms of what we were able to ship to China.
I want to ask you about the China market. Jensen talked about it being a $50 billion market. He's been quite vocal that he's kind of against the restrictions that you guys have seen. Obviously, you had the diffusion rules that went away, but that was another area that I think you spoke out against as well, trying to address this demand, be the one who does it versus something homegrown.
So maybe you can just talk about -- I think you made the comment that post the H20 ban, Jensen said that a cutdown version would maybe not be competitive and really you shouldn't think about you guys addressing China. There's still rumors about that you could cut down a chip and still address it. So what you can talk about, maybe just why China is important? And then what is your plan to address or not address that market?
Yes. So in the middle of our first quarter, we received notice from the U.S. government that we would not be able to ship our H20. Now keep in mind, our H20 going to China was the only product of significance from the data center that we were doing through a lot of work in terms of what we developed for them and a lot of back and forth with the U.S. government with continuous approval for them to do so and what we brought to market. And unfortunately, they chose to not allow it to go.
Now that means where we stand is in a situation that we really don't have anything for that market. We've discussed that it wouldn't be appropriate for us to just start a new chip at this point because essentially, the H20 from where it compared to our Blackwell architecture was significantly lower in terms of what we were being able to enable in China. That was about a 25x change from an H20 to what you would receive in terms of a Blackwell.
So we knew this takes a discussion with the U.S. government if anything new that we want to do. And we know that our work in China is not about us as alone because remember, there is domestic competition in China when you are not being able to ship your best to do so. So at this time, we are going to continue to work to see what would be possible, what could we do, given that we've gone through this now and have had to stop in the middle. That's not something that we want to do going forward.
It's a big market, though. China is a very, very big market. We can think about it just today or this year, probably could be about a $50 billion market. That's a great opportunity for us to continue to innovate, continue to build the platform from the U.S. to the rest of the world, and we think that's an important market for us to go and do. So again, still in discussions with the U.S. government, and we'll see.
I want to ask you on the supply side, it was another kind of concern entering the year. You look during Hopper, it was availability of CoWoS and supply issues more on the chip level. This time around with the GB200, it's more of a system issue, and it's not that you ran into one huge problem. It's probably a lot of little problems in just standing these racks up. So I think the interesting comment that was made on earnings was that you actually shipped 1,000 racks per customer per week, which is obviously a huge number, but I think the point was that you're starting to catch up.
So maybe you can elaborate on just the supply equation. People -- you've had a decent Blackwell number in terms of revenue, but people look at these downstream data points and the amount of racks that the ODMs can produce, and it had been quite a low number to start the year. How is that improving?
Yes. So our Blackwell architecture was a phenomenal decision on an architecture change. What we did is we pretty much shipped to our customers a full data center scale versus what they had seen in, for example, the Hopper architecture, which was a standard classic configuration of what we would be selling, which would be a motherboard with about 8 GPUs in it.
So moving to what we did with Blackwell was the importance of understanding each and every part of that data center needed to be accelerated to focus in terms of continuous performance improvement and the best efficiency from a power perspective. So that configuration, as sophisticated as it was with probably about 1.2 million different components in it, landed with many of our system integrators, our OEMs and ODMs, working to get it pretty much what they would do to build out a data center and get that into market.
So nothing unique about that. It was just the change in terms of what they had received earlier to getting a full data center. All is moving now quite seamlessly. And yes, we are getting them back up to levels so that they can move what they had received and getting them stacked into the data centers, all racked up and many of them have already started their work in terms of starting workloads on those systems.
We have also indicated, important for our Blackwell, is our next architecture moving to the 300 series or Blackwell 300. It will be the same -- pretty much the same architecture, same electronics, same mechanicals. Change in terms of the chip or change in terms of the memory is probably the only change to that. The customers are well briefed now on how to build out those GB systems, and we're excited to see that now in the next architecture as well.
I was actually going to ask about that in terms of not only GB300, but also the first generation of Rubin, per your road map, it's the same rack effectively. So kind of also want to just ask you, yes, in terms of the concern people had is that maybe there was chips at the ODM and somehow that wouldn't clear through. So I mean, I guess, in terms of...
All moving quite well. All moving well. The speed of what they're moving, what's going to be important as the next thing is getting them on a cadence where they have it all showing up and moving quite quickly. Just as we do our supply chain, you're now seeing this to be an important part in terms of standing up racks as well.
And would you relate the GB -- the H200 transition was pretty quick, kind of just over 1 quarter, almost all of it switched over. When you think about the transition to the GB300, which you're sampling now, should it be a similar kind of cadence given that it's so much overlap with the platform?
You're going to see both. You're still going to see the 200 series and the 300 series ship. Keep in mind, we are still shipping, for example, Hopper 200. So there is still the continuation. Many of them fill out -- their data centers fill out for certain workloads. So you'll probably see both of them continue over several quarters.
I want to ask you on the competitive side, particularly ASICs, I thought it was interesting at COMPUTEX, the NVLink Fusion allows some permutations with other people's silicon, whether it's a CPU or an accelerator. So maybe kind of talk about that strategy, what you're seeing from ASICs as competitors and why Fusion, I guess, is the question I get a lot.
Yes. So let's first start with NVLink. NVLink, as you know, is our very important connectivity that has been part of us for 5 generations of what we put into market, very important in terms of GPU-to-GPU connections as well as CPU-to-GPU connectivity. For example, in our GB200 NVL, you have NVLink or 72. And what we are doing is NVLink plus 8 switching; so a very, very important part of the configuration, working with the significant amount of traffic, particularly on an inferencing side. And we had taken the best of breed of what we had seen in our InfiniBand and enabled that now also with our switching Ethernet as well.
So going back to NVLink Fusion and its importance of it, this is an opportunity for folks to still maintain on our platform and get those capabilities. If they want a different CPU, yes, we have our Grace CPU, but another CPU gives them an option if they want an x86 or otherwise to still stay connected to our full platform, but having a license to that and working in terms of our networking. It could be the same in terms of ASICs as well. So our opportunity here is to continue to expand the opportunity of our platform, both with NVLink as well as networking.
It's a perfect lead and I want to ask you on networking. I mean I think when you look at your road map, it's not just a GPU road map. You have a half-dozen chips there that are all critical in making that system, which I think is the challenge, and we're going to have a couple of AI days coming up for some of your competitors. I mean they're going to have to answer that equation, how they match the NVL72.
So networking was $5 billion, up 64%. Maybe you can talk about the strength you're seeing. And then also Spectrum-X was $2 billion. So I think we get this question a lot, InfiniBand versus Ethernet, your Ethernet versus others' Ethernet. What kind of traction are you seeing on the networking side?
Yes, really good. Our networking is doing phenomenal. Just as we discussed, the importance of accelerating pretty much every part of that data center is going to be essential for many of these AI workloads. Our networking business continues to expand. You have teams really focusing on how well to integrate with so much of our work that we do in terms of AI. So we had best-of-breed in terms of InfiniBand, best-of-breed InfiniBand.
But keep in mind, many of your enterprises are on Ethernet, and we created Ethernet for AI inclusive of Spectrum-X. Spectrum-X has been very important for many of our hyperscalers and many of them we talked about in terms of on our earnings. What they are seeing is a great solution, Ethernet that would give them key for many of their enterprise tenants that they have there, but keeping those key points of the traffic that needs to be monitored so heavily plus many other capabilities with that.
Yes, it's reached a very strong level, and we are still also shipping a great amount of InfiniBand as well. So together with our NVLink, our Ethernet platform as well as InfiniBand, we have a really, really good attach rate in terms of what we're seeing. The time that they are usually choosing NVIDIA's networking to attach with what we have in terms of our GPUs, that can be over 70% of what we're seeing. So it's moving quite well.
You do a great job talking about the strategy. I want to ask a CFO question. So just finishing on the ramp of the GB200, I think the gross margin has been a big focus. And you did guide gross margins up sequentially, talked about mid-70s by the end of the year or in the future, maybe. I don't want to put words in your mouth. Can you talk about what needs to happen to get those gross margins to mid-70s?
Yes. So making a pretty big change moving to Blackwell in terms of that configuration and all the different types of components. We were in catch-up mode for a good part of the first and second quarter of getting Blackwell to market, and we have now gotten to a fairly solid ramp. And that's going to be able to assist us in terms of improving those gross margins as we get more volume and more that we can work on in terms of the yield plus the cost together, pieces of that. Let's not just forget all of those different components that are there to put together. So we made progress absolutely in Q1. We are guiding in terms of Q2 continued progress. And yes, we do see that path towards the mid-70s before the end of the year.
I want to -- we're running out of time, and I think I do want to ask, obviously, the data center and AI is the biggest part of the story. I wanted to ask on gaming. You saw a great deal of strength there. AMD saw strength as well. I think people -- the question is, are we seeing some sort of gaming cycle? Is this AI that's driving the demand? I'm just kind of curious from your perspective, what's driving the first half strength in gaming?
Yes. Thanks for the question on gaming. Gaming actually hit record levels, okay, record levels in this last quarter. But keep in mind, it's record levels and we are supply constrained. And so we've been working feverishly on getting our Blackwell architecture to market and the volume that we need to serve those customers. And I think we're getting stronger and stronger each quarter in terms of that size.
What are they excited about? They are excited about gaming. It is still such an important industry, but there are other use cases that you can see in terms of AI, AI with the PC. This, in the future, is going to be an important part, whether that be for your creatives, your independents, those that are really working, because now you have a great AI PC just as much as you have a great gaming PC. So more growth and more Blackwell for gaming to come.
And in terms of expanding the story out, I kind of want to ask you an open-ended question in terms of where you see the biggest opportunities for kind of AI over the next kind of decade. You hear stories on -- obviously, you've been in autos for a while. It's funny, we're getting like a renaissance of autonomous driving if they don't burn all of them in L.A.
But I think you hear talk about humanoid robots, you might have 10 per person, biggest market ever. Obviously, it seems futuristic, but may not be as far away as you think, obviously, in the data center, lots of applications on the R&D side. So maybe you can elaborate where you see it all, where are you most excited over a longer horizon?
Yes. There's a lot of amazing work being done that will really influence so much of the AI work. Starting first, what we see right out of the gate is that there are so many different software applications at an enterprise level. Infusing AI within those is absolutely what you see so many of those companies working on. So those are going to be some of the first things that you see.
As for other enterprises, there is not a single enterprise on the planet that doesn't have a call center. And wouldn't they just love the ability to use AI to make that as efficient as possible, as well as a great experience for their customers. More and more agentic work will begin at the enterprises, agentic work that gets done in the hours that I'm not at work, so that when I walk into the office it has gone through the reasoning and through a phase that asks what kind of work needs to be done. I can actually see that in something such as a finance organization that asks, how do we decide what we need to do in terms of booking accruals and those types of things. So a lot of work can happen agentically with AI.
But you brought up a good area for the next focus. Automotive, AV cars, EV cars, such an important industry. And yes, we have been working 10 years to really see a lot of the robotaxis on the road, or Level 2 and Level 3 in market. But it's also an introduction to another big industry, which is physical AI and robotics. Change out some of the things that you see in automotive and you can see that exact same thing coming through in robotics.
Robotics in terms of the humanoids and the multiple brains: the brains that will be back in the data center, and the brains that will actually be inside of the robots, providing that landscape for them to actually do work as well. Manufacturing and industrial AI are very top of mind and very important in this part of the world, on the European side as well. Those are some of the big things that we'll see in the future.
All right. Well, perfect. Already out of time, but thank you for joining. Thank you for everybody coming too as well. Thank you.
Thank you.
NVIDIA — Nasdaq Investor Conference 2025
🎯 Key Message
- Summary: NVIDIA is positioning itself as a core supplier for "sovereign AI" (state-funded national AI data centers) and expects orders that are already visible today (tens of gigawatts / tens of billions of dollars), with long-term potential of up to ~USD 1 trillion. In parallel, there is strong momentum for Blackwell (GB200) in inference and an improved supply chain, but China restrictions remain an uncertainty.
⚡ Strategic Highlights
- Sovereign: Focus on national models, neoclouds (regional, private clouds) and partnerships with governments; large multi-year budgets are expected.
- Blackwell & platform: Blackwell architecture targeted at reasoning/inference; GB200 NVL72 delivers significant inference improvements; GB300 is being sampled, platform continuity is planned.
- Networking & attach: Strong growth in networking (Spectrum-X/InfiniBand) with a high attach rate (>70%) to GPU systems; networking is being built out as a critical system building block.
🆕 New Information
- Specifics: Confirmed: GB300 sampling and a return to a higher production/rack cadence; management explicitly names a path to "mid-70s" gross margin by year-end as a target. H20 shipments to China were halted mid-quarter (US regulation); the current China strategy is open.
❓ Analyst Questions
- Sovereign customers: Who pays? Answer: a mix of governments, CSPs and new regional "neoclouds"; timing depends on capital availability, grid/power and construction cycles, and remains vague.
- China: Why no compromise chip? Management: H20 was halted by US order; an immediate replacement does not make sense, talks with US regulators are ongoing, no guarantee.
- Supply & margins: Demand is high and rack builds are complex; supply is improving, but the transition to GB300 and the full ramp take time; the mid-70s margin path was described, no precise timing was given.
📌 Bottom Line
- Result: The presentation confirms NVIDIA as a central beneficiary of the AI wave: large addressable markets (sovereign, data center, networking, gaming) and a technological lead. In the short term, the supply ramp and China restrictions pose the largest volatility risks; in the medium term, better supply supports revenue and margin development.
NVIDIA — Bank of America Global Technology Conference 2025
1. Question Answer
Good morning, everyone. Thank you so much for joining us on day 2 of the BofA Securities Global Technology Conference. I'm Vivek Arya. I cover semiconductors and semicap equipment here at BofA. And I'm absolutely delighted and honored to have Ian Buck, the Head of Accelerated Computing at NVIDIA join us for this keynote.
I think most of you are probably familiar with Ian, but if not, Ian heads all the hardware and software product lines, third-party enablement and marketing activities for GPU computing at NVIDIA. He joined the company in 2004. Same year, I joined Merrill Lynch. So I guess that's the only thing we have in common, I believe. And he created CUDA, which remains the established leading platform for accelerated parallel computing. And before joining NVIDIA, he was a development lead on Brook, which is a forerunner to generalized computing on GPUs. So we are absolutely thrilled to have Ian with us.
And before I get into the Q&A, I was just asked to read a brief statement. So as a reminder, this presentation contains forward-looking statements, and investors are advised to read NVIDIA's reports filed with the SEC for information related to risks and uncertainties facing their business. So with that, a very warm welcome to you, Ian. Really appreciate having you. This is, I think, our third keynote session. So I really appreciate you joining us.
Yes. We're running in the AI time. So I've -- a year ago, feels like a lifetime. One of the most challenging parts of my job often is to try to predict the future. But AI is always surprising us.
That's right. Bigger and better.
So NVIDIA -- sorry, Ian, let's just start with the big news that kind of rocked at least Wall Street early this year, which was the DeepSeek moment. So how much of that news was a surprise to you, right, because you have followed the industry for a long time. And what does it really mean for investors who are looking at that as some big seminal game-changing moment. So what are the positive and negative implications of that DeepSeek moment from your perspective?
DeepSeek was -- there are a couple of inflection points in AI, for sure. You can go back to the original Google cat moment where AI recognized cats. You can go through the ResNet moment, you can go through the ImageNet moment. In 2022, we had the ChatGPT moment, which I'm sure the investor community all noticed as well. And in January, we had the DeepSeek moment. DeepSeek itself wasn't a surprise. I think the company, DeepSeek and High-Flyer, have been around for a while. If you look at the history of the papers they've been publishing, it is amazing work. Actually, they're one of the best CUDA developers out there in terms of getting all the way down.
And if you read that DeepSeek-R1 paper and the V3 paper, which it was based on, the amount of optimization that they've done for GPUs, for NVLink, for GPUDirect RDMA, for sending data from the GPU over PCIe to the NIC and over NVLink, to build a training and inferencing platform and solution and technology, is truly amazing. What the moment really activated, though, was reasoning. It was the first open world-class reasoning model, and it was truly open. They explained how they built it, how they trained it and the optimizations that they did to train it to that level of intelligence and to optimize the execution of the training and inference stack.
And there are some amazing graphs in that paper that show it. It was basically a barn door moment for reasoning models in AI. And today, I think the world would agree, you can't really publish or celebrate a new model without it being a reasoning model. Reasoning wasn't new. OpenAI had been publishing papers about using reasoning. o3, o4-mini, excellent, and Gemini, all reasoning models. But DeepSeek really made it ubiquitous, open, and democratized it. The implications and the impact were not understood when it launched.
First off, by being open, anyone can run it anywhere. Today, DeepSeek-R1 is, call it, $1 per million tokens, whereas a traditional LLM like Llama 70B might be $0.60 per million tokens. It's a big model, 671 billion parameters. I think it's 38 billion active parameters. It has 120-plus layers and 250 experts or the shared expert. That is the kind of stuff that only folks like Gemini or OpenAI had, that level of complexity and technology. You had a truly world-class open model. Running that level of complexity is really hard.
What I've -- what has happened is now that -- and what makes reasoning so useful is the fact that your output tokens, you let the model think, you teach the model to think and really kind of think out loud. If you've ever used DeepSeek-R1, it's quite amusing to watch it think. It actually is just talking out loud asking itself questions. It's actually trained itself to come up with an answer by thinking out loud and then it doesn't give you that answer right away. It actually -- you can see it, it checks the answer. So it's taught itself to check the answer and make sure it's right by double checking its math. And then it doesn't give you the answer again, it checks it a second time.
And that's very intentional. They actually train the model to think for as long as it can until it comes up with an answer, check it once, check it twice and then give you the answer. As a result, we're seeing an explosion in the number of tokens generated. You ask Llama a question. You get an answer back at about 100 words. That's it. You paid for those hundred words or, call it, 200 some-odd tokens, at $0.60. With DeepSeek, it reasons for about 1,000 words, and then it gives you that 100-word answer, and it's right.
And all those tokens you're paying for, by the way, you value at $1. So in general, DeepSeek has kind of made every model a reasoning model. The inference demand as a result has kind of exploded. The opportunity for multi-GPU, multi-node inference is everywhere. It was actually great timing for GB200 because of all those GPUs connected with NVLink in Blackwell, and you're seeing that now. With the increase in value, even a free open model like DeepSeek-R1, at $1 per million tokens, generates about 13x more tokens. That's like 20x more total market opportunity for inferencing because of reasoning. Actually, they just announced a new rev of DeepSeek-R1. On the AIME math benchmark, they were getting about 70% accuracy, 69% or 70% accuracy. It's kind of like a C minus; 70% is like you're getting 2 out of 3 questions right. That's not that great.
They just updated R1: same model, better weights, the same cost, and it is now 89% accurate. So they went to kind of a B+, which is basically 9 out of 10 questions right, versus 2 out of 3. And the way they did that, they taught the model to think longer. So they just doubled the number of tokens they're generating, and how much thinking out loud it does. So again, as these models are getting smarter, it is driving more output tokens, more thinking and more opportunity for token revenue.
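The token arithmetic in this answer can be checked with a few lines of Python. This is only a sketch using the rounded figures quoted by the speaker ($0.60 and $1 per million tokens, roughly 200 tokens per plain answer, about 13x more tokens with reasoning); the function names are made up for illustration.

```python
# Figures quoted in the transcript; the helper names are illustrative only.
LLAMA_PRICE_PER_M = 0.60   # USD per million tokens (Llama 70B, as quoted)
R1_PRICE_PER_M = 1.00      # USD per million tokens (DeepSeek-R1, as quoted)
TOKEN_MULTIPLIER = 13      # "about 13x more tokens" for a reasoning answer


def cost_per_answer(tokens: int, price_per_million: float) -> float:
    """USD cost of one answer with the given number of output tokens."""
    return tokens * price_per_million / 1_000_000


def revenue_multiplier(base_price: float, new_price: float, token_ratio: float) -> float:
    """Token revenue per question relative to the baseline model."""
    return (new_price / base_price) * token_ratio


if __name__ == "__main__":
    base_tokens = 200  # "200 some-odd tokens" for a ~100-word answer
    reasoning_tokens = base_tokens * TOKEN_MULTIPLIER
    print(f"Plain answer:     ${cost_per_answer(base_tokens, LLAMA_PRICE_PER_M):.6f}")
    print(f"Reasoning answer: ${cost_per_answer(reasoning_tokens, R1_PRICE_PER_M):.6f}")
    ratio = revenue_multiplier(LLAMA_PRICE_PER_M, R1_PRICE_PER_M, TOKEN_MULTIPLIER)
    print(f"Revenue per question: about {ratio:.1f}x")
```

With these inputs the multiplier comes out a little above 20, which is consistent with the "like 20x" figure mentioned above.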
Do you think anything that DeepSeek is doing or what's happening in China as a proxy for let's call it, CapEx constrained computing. So there is a lot more effort being made to make these things a lot more efficient because they may not have access. Do you think they are able to bend the cost curve in a way that has implications on how much spending needs to happen in this industry?
No. Actually the opposite. They just made public what everyone was doing; they just talked about it in an academic paper. Computing has always been constrained. Access to compute, amount of compute, dollars of compute, capital expenditure of compute. The AI race is about, regardless of how much compute you have, how efficiently you're using it, how intelligently you're using it and how much value you bring. Everybody wanted Hopper with the ChatGPT moment. That wasn't unique to DeepSeek. It was around the world. It's just, do you have the engineering talent to capitalize on it, to code your CUDA, know your InfiniBand, know your NVLink, optimize your transformer layer.
One of the big innovations that DeepSeek did is they used a new technique called MLA, which is a statistical method for approximating the weights and the KV layers of the transformer layer. It wasn't a new idea. It had actually been deployed in image generation, all that fun stuff, drawing a picture of a teddy bear swimming Olympic laps. They were using this MLA statistical technique, but it compressed the bejesus out of the transformer layer and made it a lot cheaper by approximating, and they were able to apply it to DeepSeek-V3 and R1. That was the first time it had been publicly talked about. Trust me, these methods are being deployed and optimized, just not everyone wants to share them. DeepSeek themselves are doing the world a favor by sharing some of the state-of-the-art research they're doing with the world. But it's happening everywhere. And it was happening back in Hopper. It was happening even back in the A100 days as well.
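To give a feel for why compressing the KV side of the transformer layer makes inference cheaper, here is a rough sketch of the cache-size arithmetic behind a low-rank latent scheme such as MLA (multi-head latent attention). The dimensions below are illustrative assumptions chosen for the example, not figures from this talk, and the function names are hypothetical.

```python
def kv_cache_elems_full(num_heads: int, head_dim: int) -> int:
    """Per-token, per-layer cache for standard multi-head attention:
    one key vector and one value vector for every head."""
    return 2 * num_heads * head_dim


def kv_cache_elems_latent(latent_dim: int, rope_dim: int = 0) -> int:
    """Per-token, per-layer cache when only one shared compressed latent
    (plus an optional small positional key) is stored instead."""
    return latent_dim + rope_dim


if __name__ == "__main__":
    # Assumed, illustrative dimensions (not taken from the transcript).
    full = kv_cache_elems_full(num_heads=128, head_dim=128)
    latent = kv_cache_elems_latent(latent_dim=512, rope_dim=64)
    print(f"full KV cache:   {full} elements per token per layer")
    print(f"latent KV cache: {latent} elements per token per layer")
    print(f"reduction:       {full / latent:.1f}x")
```

The keys and values are reconstructed from the latent at attention time, so the saving is in memory and bandwidth at the cost of an approximation, which matches the trade-off described above.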
Got it. You talked with a lot of cloud customers. Many of them are developing their frontier models. Are you seeing any kind of saturation or diminishing returns in the size of -- the benefits from increasing the size of these models. There was this kind of public story about Meta's large language model where they are not getting enough ROI on it, right? So do you see any saturation in the effectiveness of these models that again, because what this community cares about is CapEx at the end of the day. So is there anything that is happening from a Western large language model perspective, that gives you a pause on how long and how big can Western AI CapEx be?
So I won't get too hung up on the Behemoth question. Behemoth is an open model. There's a competition in the open space. It's hard to launch a model if it's not world class, and it relates to your brand and how it compares to all the models that are out there. What I am seeing right now is the drive toward reasoning models first. They just add so much more value. They're able to think and solve a problem. And that is based upon only 2 things. One is how much knowledge they have, which is the size of the model, and the other is how good they are at thinking, using that knowledge to come up with an answer to a question. Traditional LLMs simply regurgitated what they knew. Traditional Llama 70B, 70 billion parameters, was trained on the corpus of the Internet.
When you ask a question, it is really just trying to reconsolidate the information it knows and answer your question, but it can't really think. What DeepSeek and the other models are doing right now is they take the corpus of the Internet and use that information to think and answer your question. So what I'm seeing is that the more they know, the quicker they can think, the more accurate the answer they come up with or the cheaper their answer is. So we have a conflation of taking all of the knowledge that they know and baking it into the model repeatedly. And the more questions get asked, the more data or answers they can invest into the model itself. You and I don't need to work out that 50 plus 50 is 100. That's because we just know it.
A first grader needs to actually do the math and carry the 1. But once they've done that, it's now part of their inherent knowledge. Think about that with ChatGPT, think about Grok, think about Meta AI. Every time someone is asking a question, they think about that answer, and they are expanding the corpus. Now that answer gets baked into the model itself, and the models are constantly training and retraining, retraining, retraining. So they are both inferring, making money or adding value for the customers, and also getting smarter, and their intelligence, how much they know, is strictly the size of their model. So that's why, when we were talking last year, a 100 billion parameter model plus was a rarity. Now 100 billion is kind of table stakes, going to 600 billion. And obviously, we have models out there that are in the trillions, but they're not open.
So that's because they're adding value. There's a benefit to that model being smarter, to answer the question quicker or answer more valuable questions. The tricks that are happening are the tricks in executing the model. The MoE experts, which is a hard thing to do, actually picking through the whole model to decide which parts of that knowledge to pull from and compute on versus skip, is where a lot of the innovation is happening. So there's a little bit of a race right now between model size and active parameters. Traditional LLMs are not MoE. They just compute on every piece of knowledge they know. And you and I both know that's not very efficient, to take all the knowledge you know and process it relative to what the answer is. So that's what experts are. They split the model up into little pieces. And throughout the whole thinking path, they're trying to prune and only pull in the right parts. And DeepSeek made public a lot of what we were doing, which is having the experts in every layer of the stack. So the models are getting bigger.
The active parameter -- it's a race between that and the active parameters to answer your question. You're only seeing a small glimpse in the public papers of what the true behind the scenes world-class work has actually been able to do.
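The idea of pulling in only the relevant experts can be sketched as a top-k router. This is a generic, simplified illustration of mixture-of-experts gating, not a description of any particular model's router; the scores and function names are hypothetical, and the 671 billion / 38 billion figures are the ones quoted earlier in this conversation.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores: List[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts for one token and renormalize
    their gate weights so they sum to 1. All other experts are skipped."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the model's parameters actually computed on per token."""
    return active_params / total_params


if __name__ == "__main__":
    scores = [0.1, 2.0, -1.0, 1.5]  # hypothetical router scores for 4 experts
    for expert, weight in route_top_k(scores, k=2):
        print(f"expert {expert}: gate weight {weight:.3f}")
    print(f"active share: {active_fraction(38e9, 671e9):.1%}")
```

With the quoted sizes, only a few percent of the parameters are touched for each token, which is the "model size versus active parameters" trade-off described above.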
So a year from now, how large of model sizes will you be talking about?
We're already using trillion parameter models today. You just don't know it. The active parameters are highly variant and the techniques and every piece of idea that you can use to trim how much compute you use, like you said in your previous question, is being applied, researched, figured out.
What then happens is that the other way of optimizing for compute is distillation. So you take the trillion parameter model, and if you fine-tune, if you can limit the use case or limit the application to a vertical or a narrow work space, you can reduce it down to a 70 or 7 billion parameter model, and there's lots of that.
Quick small models, like for doing search text, when you type in your text on your phone and it's expanding the sentence for you, that's a very small model, which can be finely tuned to you, personalized to you and what you may be doing at that moment. So we see an explosion of these vertical models. On Hugging Face right now, if you search for Llama, you're going to find bazillions of distilled models. By the way, all those distilled models also need to be computed on and they're constantly being regenerated.
One of the big consumers of GPUs is distillation: taking a big model, running inference on it, creating smaller models. So they start from a really highly intelligent one, and they distill down. So I think we're all getting to 1 trillion parameter models now. There's talk of when we get to 10T, and how many active parameters, and what that model actually looks like in terms of the optimization stack is pretty funky.
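One common textbook formulation of distillation (not necessarily what any company mentioned here uses) trains the small "student" model to match the large "teacher" model's output distribution. The sketch below computes that matching loss for a single token position; the logits are made-up numbers and the function names are hypothetical.

```python
import math
from typing import List


def softmax_t(logits: List[float], temperature: float) -> List[float]:
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions,
    scaled by temperature squared as in the classic formulation."""
    p = softmax_t(teacher_logits, temperature)
    q = softmax_t(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]      # hypothetical teacher logits
    good_student = [1.9, 1.1, 0.2]
    poor_student = [0.1, 1.0, 2.0]
    print(f"close student: {distillation_loss(teacher, good_student):.4f}")
    print(f"far student:   {distillation_loss(teacher, poor_student):.4f}")
```

Producing the teacher's outputs for every training example is itself an inference workload on the big model, which is why distillation shows up as a large consumer of GPU time.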
The next topic, Ian, would love to get your perspective on is NVIDIA's competitiveness as the world moves to more inference, right? In training, I think there is recognition that NVIDIA has done an outstanding job. But as we go to inference, there's a fragmentation of workloads, optimization, et cetera, et cetera. One of your GPU competitors has added a lot more high-bandwidth memory, and they are saying, that's better for inference. There's a whole bunch of start-ups, right, who are promising lower cost per token, et cetera. So how do you view NVIDIA's competitiveness when it comes to the inference market? And even if we could compare it against a lot of the ASIC players that are out there.
It's a good question. NVIDIA thrives at things that are hard. We just do. We're an engineering and technology company. I've got a boss who is passionate about solving the hard problems and letting other people make money and innovate on top of what we can provide as a platform. And my life is I want to update my bio, I'm just a platform guy. I'm just constantly building technology platforms, to help other people make money. The inference is really hard. It's wickedly hard. It's actually, in many cases, while training is hard for different reasons, trying to do 100,000 GPUs or going to 1 million GPU distributed training clusters and keeping that thing going at scale is a data center scale, reliability, networking, 1 giant GPU problem. Inference is a myriad of optimizations.
You start with numerical precision: 32-bit floating point, 16-bit floating point, 8-bit floating point, 4-bit floating point. Just to use the opportunity, Blackwell has 20 petaflops of FP4 per GPU. For petaflops, that's a lot. The fastest supercomputer in the world is measured in exaflops, which is only 1,000 petaflops; we got that in FP4. But making 4 bits work and come up with the right answer, you only have 4 zeros and ones, that's not a lot of numbers. So mathematically, numerically getting an accurate answer by using only that requires expertise in numerical and quantization primitives that are extremely complicated.
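To make "only 4 zeros and ones" concrete, the sketch below enumerates every value a 4-bit float can take, assuming the common E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1). The talk does not say which encoding is meant, so the layout is an assumption, and the helper names are hypothetical.

```python
def decode_e2m1(bits: int) -> float:
    """Decode a 4-bit pattern as sign(1) / exponent(2) / mantissa(1), bias 1."""
    if not 0 <= bits <= 0b1111:
        raise ValueError("expected a 4-bit value")
    sign = -1.0 if bits & 0b1000 else 1.0
    exponent = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    if exponent == 0:
        magnitude = mantissa * 0.5  # subnormal: no implicit leading 1
    else:
        magnitude = (1.0 + mantissa * 0.5) * 2.0 ** (exponent - 1)
    return sign * magnitude


# All distinct representable values (+0 and -0 collapse into one entry).
FP4_VALUES = sorted({decode_e2m1(b) for b in range(16)})


def quantize_e2m1(x: float) -> float:
    """Round x to the nearest representable value; out-of-range inputs
    clamp to the largest magnitude because no larger candidate exists."""
    return min(FP4_VALUES, key=lambda v: abs(v - x))


if __name__ == "__main__":
    print(FP4_VALUES)
    for x in (0.3, 2.4, 5.1, 100.0):
        print(f"{x} -> {quantize_e2m1(x)}")
```

Under this assumption there are just 15 distinct numbers to work with, which is why per-block scaling and careful quantization are needed to keep answers accurate. On the throughput side, at the quoted 20 petaflops per GPU, 50 GPUs add up to 1,000 petaflops, that is, one exaflop of FP4.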
Go up from there, you now have distributed the model. The model isn't on a single GPU, a single piece of silicon, I don't care who you are, and in order to get performance, you have to connect multiple chips together to run in parallel within the node. And then if you're going to do the high-value models, you're going to actually have to run multi-node and connect them all together. And you've seen, and we share, how complex the GB200 NVL72 is.
On top of that, you have diversity of workload. An AI factory is not going to run just 1 model all day long. It's easy to benchmark one model, it is easy to optimize for one model, and certainly, it can be easy to build for. If you want to run just 1 thing, you can tune your architecture for that, but AI factories are going to run every kind of model, and the models are going to change. You're buying a $1 billion AI factory, you're going to need to capitalize that expenditure for 5 years. You damn well better make sure that whatever you buy now, you're going to be able to run and capitalize and create value for 5 years. As for the future of AI, go back 5 years: we were launching the first A100. I think I was still talking about ResNet back then.
So that's a really important and strategic investment for companies, to make sure that they're building an AI factory that can do all of those optimizations, all those techniques, run all those models today and next year and the year after that, all the way out to 2030. That's why the platform is so critical. That's why NVIDIA has got to work with every single AI company, and we do, to make sure that our platform is constantly innovating. We do invent some of that technology, but the vast majority of it actually comes from all of those companies like OpenAI, like Meta, like the Grok model at xAI, as well as the entire academic community. Amazing innovations come from there, and also from DeepSeek.
FlashAttention came from a student; he's now a professor at Princeton. And just right there, he doubled the transformer performance because he figured out a way to run it more efficiently, more accurately and with less cost. So the inference market is about running every model across all those factories, now and into the future. It's a fascinating business model. Data centers are bought with billions of dollars, 5 years of CapEx, and you end up charging dollars per hour or dollars per million tokens at the other end.
So if, let's say, you were the head of AWS, how would you go about making the decision between ASICs or GPUs for your AI factory?
You should ask Matt that question. He's a good guy. I worked with him.
Well they talk a lot about Trainium.
I'm sure. I know. And they should, right, I mean, it's -- building silicon is hard. Talking to somebody who's been involved with it for 20 years. It's hard and getting even more complicated. So it's no small feat to be able to achieve even what they've achieved. And I'm super happy. I mean that's impressive what they've been able to -- anyone who's gotten over the -- survived it and been able to do multiple generations and stuck with it is -- requires almost founder level CEO commitment to make it happen.
They and every hyperscaler, they're all building their own silicon. They're both our customers and also looking at alternatives, and they rightly should, their own and other silicon and other opportunities out there. Each of them has to find what they need to optimize for, what they need to go serve and what they're going to do for their business.
So I can't speak for Matt's business exactly where he's going to be applying all those likewise with TPU. They're all looking at -- they have an internal workload and an external opportunity. They're all very passionate about making sure they provide our time to market, the latest NVIDIA GPUs and the customers and workloads that we bring to their clouds.
So our business with AWS and with everyone is extremely healthy and it continues to grow. AWS launched the B200 HGX, the first to launch it. We talked a lot about NVL72, but there is also the existing HGX platform, which is just 8 GPUs, NVLink connected, the same architecture that ChatGPT ran on with Hopper. We also do it with Blackwell. It's a fantastic inference platform. It runs all the same Hopper workloads, all on x86. It carried over and immediately provided a 3x boost for inferencing.
So for everyone who is on Hopper using H100 or H200 HGX, as soon as they go to B300, they immediately get a 3x boost. And you see that in the Artificial Analysis benchmarks and everything else in terms of performance. So AWS is an excellent partner. How are they going to apply it and where do they see their opportunity? Everyone's got to define that niche or that area where they're going to add value, and then how they're going to engage a community. It's one thing to win on a benchmark or do a certain workload; it's a whole other game to try to activate an ecosystem and developers and your platform into the market.
Not all need to do that. And certainly, some have chosen to work on certain opportunities. But the undeniable part of it is that we're constantly making things faster. We are lowering costs. We are making things more profitable as per the DeepSeek B200 example. And we just -- we're doing that like annually. So each of them have to kind of choose where they're going to provide value or differentiate.
So if I ask the question in a different way, which is, today, if I look at $100 of spend on AI, $10 to $15 of that is going into ASICs. If we go out the next 3, 4, 5 years, what makes this $10 to $15 go to $20 to $25. What do you think would have changed in the industry or can change in the industry to make it more towards ASICs and away from merchant silicon.
Well, you look at the problem because of the profitability. In your revenue, your performance is actually your profitability, your gross margin. And you can look at like the cost reduction and we have a component. But generally, when we look at it, we look at it in terms of there's $1 billion of AI factory that you're going to generate. How many tokens is it going to output compared to the previous generation and how much more value that those tokens are going to -- not just in strict dollar -- same dollar per token in the same model, but if you can deliver 3x more tokens per second, you would pay more for that.
So the reasoning of the -- like in a reasoning model, you get your answer faster or be able to reason within a certain amount of time, you actually pay a premium for that. Asking what is 50 plus 50, go away for an hour, come back versus getting it right away is more valuable.
So the dollar spend in a data center on chips is actually pretty small. If you look at the chip silicon cost, or even just the dollars they're spending on the chips, versus everything that goes around the chips, that surrounding part is increasingly a really important part of the value, because AI, inference and certainly training, because of the value of reasoning in these large models, is not a single-GPU-chip business anymore. It's about connecting all those chips together with high-speed signaling and, as a result, liquid cooling to fit them all in 1 small space so they can all talk to each other at those speeds.
The more you spread them apart, the slower the signals have to travel. And so that's why liquid cooling brings it all together. The complexity and the value that that brings is driving up the -- it's not because we want to spend that much more money. We want to run that fast because the value that we bring by bringing that together drives up the revenue side of it. So we will always look at the previous generation. We'll always look at what the opportunity is and what others are able to actually achieve on the basket of workloads that we know is valuable now, and we do our best to predict what's going to be valuable in a year or 2 years' time.
And then the good news is NVIDIA is always coming up with new GPUs every year now, new architectures every year now and also optimizing the data center design every year. So I -- that makes my job a little easier. You used to have to predict a 3-year horizon. Now I can think about now in the future. And if I get -- we get to see another opportunity or we get a little bit wrong, we can just keep fixing and fixing and fixing it.
So that's the -- in terms of how do ASICs or alternatives play, I think it's going to be basically what niche, what vertical, what workload do they want to optimize for what use case and what they want to decide. NVIDIA's goal is not to run every AI model everywhere. Certainly, what goes in a ring doorbell should be what the silicon inside of ring doorbell should do or a hockey puck on your kitchen counter or what's inside of your phone and how they want to work there, where we're going to focus -- or I focus on is just the AI factory for inference and the training clusters at scale. And increasingly, those 2 things are melding together.
And then also, providing it as a platform from -- with all my cloud providers that all the startups, all the innovators, the next OpenAIs and every enterprise can get access to the technology and capitalize on the opportunity of the revenue that the tokens bring to them and also the token serving companies can make money on the top. So it's really important to look at the overall end-to-end value that the inference brings in terms of revenue, add to the cost of compute, which is actually going up in percent or the benefit in revenue and benefit is going up in X factors.
That's -- we're seeing that and but -- and only by providing that kind of percent to X factors, do you get a growth trajectory that NVIDIA can hopefully provide and will continue to provide in the future. So when we look at our value props, we look at our pricing, we look at our models, we're always looking at that net of through the chain, is everybody adding value? Is everybody able to capitalize it and be able to continue to scale up and grow. And if you just look at it over time, it's percents to X factors to big X factors. At GTC, you often see the big X factors kind of in there. But there is that whole model that actually gets played out in that world.
Maybe 1 last 1 or 2 things. The new sovereign AI opportunity, how incremental is it? Is it just a lot of the Western companies just deciding to spend overseas? Or is this truly incremental versus like the original build-out of the internet was pretty kind of concentrated. Now as we are starting to see all these new AI factories open up, is this truly incremental demand for this?
It definitely is. So when you go and talk to governments or nations or -- and actually, a lot of the supercomputing. My other job is HPC. I've been doing supercomputing for -- it's where kind of this whole thing started from. Those same people are now like getting -- are in the center of attention in every country because computing is important for their nations. We just did I believe it was a 10,000 Blackwell GPU AI factory in Taiwan. It's for Taiwan industry. It's owned by Taiwan. It's there to help apply AI to manufacturing, whether it be silicon or automotive or city or civil or as a resource for the country. We have -- we're seeing Japan, a country that is rich with data with unique industries, with a unique population and demographics and a country that's facing significant change and how to grow.
They're building AI -- they're building their own -- they're using that data, building their own -- they see AI as a national need or computing need in order to basically apply their data, apply AI, apply computing to their industries. And by consolidating -- by the government stepping in, by the nation stepping in, they can actually consolidate that as a national resource versus waiting for every single company or every single industry to necessarily build their own, and they can pool some of those resources, and they're a good partner with NVIDIA. Seeing the same happening in Germany. It has happened already in the U.K. These are basically -- and they know how to build them because most of those countries know why supercomputing is important. Now it's really elevated with AI to execute.
So yes, my -- the HPC and supercomputing side of the business has exploded as a result, and they know how to execute. So it is a really exciting opportunity. And every nation sees the opportunity to be a player on the stage and apply that. It starts with keeping their data, keeping their computing local and also prioritizing it.
How large do you think it can be over time?
It's a good question. Today, we are seeing about 100 AI factories being built and assembled right now across the world.
And AI factories how much like 1 billion-ish or how much is an AI factory?
Stuart and the other teams can talk to it, but we track it as a data center build that is we have either B2 Blackwell or Hopper is specifically designed for serving inference tokens for industry. And that is a number that's just going to continue to track and grow over time. The -- actually next week is GTC Paris and also ISC, 2 events at the same time, International Supercomputing Conference. You'll hear a lot about AI factories and sovereign AI and the activities.
So European Commission actually announced big projects earlier this year.
Europe gets it. They absolutely get the fact that they can and have the capability to deploy. U.S. as well, last week launched NERSC down -- over in Berkeley across the bay, 9,000 Vera Rubin. It's actually our first supercomputer announcement with our next-generation Rubin architecture, was announced with the Secretary of Energy and actually Jensen participated in the announcement, that will be deployed next year. 9,000 Vera Rubins and the mission of NERSC is open science and also for industry for -- and named after -- the supercomputer is actually named the Doudna supercomputer, named after Dr. Doudna who invented -- I guess, discovered CRISPR and she was there, a wonderful woman, brilliantly intelligent. And as an example of using and why computing is important for health care and pharma discovery.
So this -- and one of the purposes of the supercomputing is to combine and figure out how to apply both traditional simulation and AI together to advance scientific discovery and needs for the nation.
Got it. And maybe 1 last question. What do you think will create a constraint on this growth? Is it access to power? Is it customers may not be able to adopt this kind of annual cadence of products? Is it just that CapEx demands are going up? Like what do you worry about the most as you look over the horizon?
There's a diversification that's happening. Of course, it was -- the business is expanding. The number of players in the data center world is expanding. Certainly, power how many megawatts do you have and how many gigawatts do you have and we track that very closely with all of our CSP partners, but also now increasingly with all of the NVIDIA cloud partners and GPU data center partners, you've obviously heard of CoreWeave, but there's Lambda, there is data, maybe there's many, many players now. And the template of how to secure data center, secure GPUs for that data center and align with customers. There's actually -- and also on top of that, the software and infrastructure necessary to operate and run. It's not even just a cloud, just a GPU factory, an AI factory, a token factory is starting to become fine-tuned and executable and operational.
So there's multiple things that are coming together to help accelerate the growth. Certainly, the hyperscalers know how to do it, and they're investing the time. You can see there how much megawatts and how many data centers. Microsoft just talked about the fact that they're this year are deploying more new capacity than the -- this year alone than all the capacity they had 3 years ago. So there is an up-and-to-the-right curve in terms of -- and they shared what their next-generation 100,000 -- hundreds of thousands of Blackwell GPUs under 1 site that they're building and they talked about in their Build keynote. Look at Scott Guthrie's keynotes. It was great to see them talk about it.
That is now -- but there's a diversification happening in terms of where can everybody get their compute, certainly, as more enterprises need it, as more start-ups need it, they're both going to the public clouds for sure, but they're also looking at all the regional clouds. And what they can do from a data center capacity. So the growth it requires is being tracked by gigawatts of compute that's being put online, not just by CSP but by the world by all the players. The speed at which the AI -- the deployment software and stacks get standardized or commoditized or understood or how fast they can deploy. And that as a result, diversified. You certainly hear about the big, big ones, obviously, but that is a portion of the business. There's a very long tail of and sizable part of the business that is distributed that's happening in the world, which is exciting because it's more people being able to contribute, deliver the compute and make it available.
I think the only other limiter right now is the speed at which people are coming up with new high-value models and bringing them to the enterprise. The enterprise and that's all the Fortune 500s, their ability to take an AI model and have it add value to their business, whether it's straight uplifting ChatGPT and putting that into a help or top of the search bar or to applying an ad revenue, to applying better connecting a feed to inserting the right ad or right product placement to closing and making it profitable for them.
So that is certainly happening. And that's where the speed at which that was the limiter there is just how many models, how many different techniques can be deployed in all those different use cases. It's also really hard to track. I feel bad for you guys trying to figure that out. We get to -- but if you see the activity around AI for enterprise, that is the demand generation that we're seeing across all of our consumption of our GPUs.
Got it. I know we are out of time. I did want to ask just 1 last question. What is NVIDIA's ability to monetize software? And where are you in that journey?
Sure. I'm going to pause on the public statements on software monetization because I don't have that off the top of my head. I don't want to say anything about it. But I think we get to see some of the things we've said in the past. Our -- we have sort of -- NVIDIA is an open company. So my job is to make sure that their computing platform is available everywhere. And to provide that compute, whether it be in the cloud directly, go all the way down to CUDA all the way up to running PyTorch or running a model on Hugging Face. For the enterprises, there's -- companies want to work directly with NVIDIA. We have the opportunity to monetize working directly with NVIDIA on specific models. And make it available. It's not to supplant the community but to provide direct engagement, and that comes in the form of providing a supported Nemotron model, which is a model NVIDIA generates, it's trained and actually my team to provide that extra value directly to them.
The other opportunity is in the data center software itself. A lot of our partners are looking for help to provide the infrastructure, and we've talked about Lepton before, that software to support the cloud to take -- it's one thing to stand up a data center full of GPUs. It's another thing to operate it as a data center and be able to serve and host and schedule and execute. That's another use case where we can provide that value. And in general, our software stack, all of our libraries, all of our CUDA-X and all of the inferencing software like Dynamo and everything else, customers want to be able to engage directly with NVIDIA. We also offer that as an enterprise support so that they can have a direct relationship with NVIDIA.
As our software footprint expands as where they want to engage directly with us, we can directly monetize or provide a service to them, which they want to pay for. They want that engagement and of course, as that value and as that goes to the broader enterprise, you'll continue to see that number increase.
I can go on for another hour, but we are out of time. Thank you so much, Ian. Really appreciate your insights. Thanks, everyone, for joining.
NVIDIA — Bank of America Global Technology Conference 2025
📣 Key Message
- Core statement: DeepSeek-R1 and the spread of "reasoning" models have massively increased inference demand: longer token sequences per query create significantly more revenue and usage volume and drive demand for multi-GPU/multi-node infrastructure.
🎯 Strategic Highlights
- Platform advantage: NVIDIA positions itself as an end-to-end platform (hardware + CUDA + inference software) for "AI factories" and emphasizes NVLink/GPUDirect as well as continuous architecture iterations (Hopper→Blackwell etc.).
- Inference complexity: Inference requires a wide range of optimizations (quantization, distribution, MoE/expert techniques, distillation); NVIDIA relies on broad support for all models rather than niche ASICs.
- Sovereign AI & markets: Government projects and regional "AI factories" (examples: Taiwan, Japan, Europe) are genuinely incremental demand, not just a relocation of existing capacity.
🔍 New Information
- DeepSeek details: DeepSeek-R1 as described in the talk: ~671 billion parameters (about 38 billion active), openly served at ~$1 per 1M tokens; higher accuracy from longer "thinking" drives a ~10–20× effect on the inference TAM.
- Product news: Cloud platforms with B-series/HGX systems (B200→B300) reportedly deliver up to ~3× better inference throughput; NVIDIA cites ~100 AI factories under construction.
- No financial guidance: The conversation provides operational and technical insights but no figures on revenue or earnings guidance.
⚡ Bottom Line
- Takeaway for shareholders: The event confirms NVIDIA's structural strengths: platform dominance and a growing inference TAM driven by reasoning models. The long-term growth case remains intact; points to watch are infrastructure limits (power, rack and liquid cooling), competition from specialized ASICs in niches and the not yet fully transparent monetization of software and services.
NVIDIA — Q1 2026 Earnings Call
1. Management Discussion
Good afternoon. My name is Sarah, and I will be your conference operator today. At this time, I would like to welcome everyone to NVIDIA's First Quarter Fiscal 2026 Financial Results Conference Call. [Operator Instructions] Toshiya Hari, you may begin your conference.
Thank you. Good afternoon, everyone, and welcome to NVIDIA's conference call for the first quarter of fiscal 2026. With me today from NVIDIA are Jensen Huang, President and Chief Executive Officer; and Colette Kress, Executive Vice President and Chief Financial Officer.
I'd like to remind you that our call is being webcast live on NVIDIA's Investor Relations website. The webcast will be available for replay until the conference call to discuss our financial results for the second quarter of fiscal 2026. The content of today's call is NVIDIA's property. It can not be reproduced or transcribed without our prior written consent.
During this call, we may make forward-looking statements based on current expectations. These are subject to a number of significant risks and uncertainties, and our actual results may differ materially. For a discussion of factors that could affect our future financial results and business, please refer to the disclosure in today's earnings release, our most recent Forms 10-K and 10-Q and the reports that we may file on Form 8-K with the Securities and Exchange Commission. All our statements are made as of today, May 28, 2025, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements.
During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website. With that, let me turn the call over to Colette.
Thank you, Toshiya. We delivered another strong quarter with revenue of $44 billion, up 69% year-over-year, exceeding our outlook in what proved to be a challenging operating environment. Data Center revenue of $39 billion grew 73% year-on-year. AI workloads have transitioned strongly to inference and AI factory build-outs are driving significant revenue. Our customers' commitments are firm.
On April 9, the U.S. government issued new export controls on H20, our data center GPU designed specifically for the China market. We sold H20 with the approval of the previous administration. Although our H20 has been in the market for over a year and does not have a market outside of China, the new export controls on H20 did not provide a grace period to allow us to sell through our inventory. In Q1, we recognized $4.6 billion in H20 revenue, which occurred prior to April 9, but also recognized a $4.5 billion charge as we wrote down inventory and purchase obligations tied to orders we had received prior to April 9.
We were unable to ship $2.5 billion in H20 revenue in the first quarter due to the new export controls. The $4.5 billion charge was less than what we initially anticipated as we were able to reuse certain materials. We are still evaluating our limited options to supply data center compute products compliant with the U.S. government's revised export control rules. Losing access to the China AI accelerator market, which we believe will grow to nearly $50 billion, would have a material adverse impact on our business going forward and benefit our foreign competitors in China and worldwide.
Our Blackwell ramp, the fastest in our company's history, drove a 73% year-on-year increase in Data Center revenue. Blackwell contributed nearly 70% of Data Center compute revenue in the quarter with the transition from Hopper nearly complete. The introduction of GB200 NVL was a fundamental architectural change to enable data center-scale workloads and to achieve the lowest cost per inference token. While these systems are complex to build, we have seen a significant improvement in manufacturing yields, and rack shipments are moving to strong rates to end customers. GB200 NVL racks are now generally available for model builders, enterprises and sovereign customers to develop and deploy AI.
On average, major hyperscalers are each deploying nearly 1,000 NVL72 racks or 72,000 Blackwell GPUs per week and are on track to further ramp output this quarter. Microsoft, for example, has already deployed tens of thousands of Blackwell GPUs and is expected to ramp to hundreds of thousands of GB200s with OpenAI as one of its key customers. Key learnings from the GB200 ramp will allow for a smooth transition to the next phase of our product road map, Blackwell Ultra.
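The rack and GPU figures quoted above are the same number expressed two ways. A minimal sketch of that conversion, assuming 72 GPUs per rack (implied by the "NVL72" name and by the quoted figures themselves); the function name is illustrative only:

```python
# Assumption: one NVL72 rack holds 72 Blackwell GPUs, as implied by the
# "1,000 racks or 72,000 GPUs" figure quoted on the call.
GPUS_PER_NVL72_RACK = 72


def gpus_from_racks(racks: int, gpus_per_rack: int = GPUS_PER_NVL72_RACK) -> int:
    """Convert a rack count into a GPU count."""
    return racks * gpus_per_rack


# Roughly 1,000 racks per week per major hyperscaler, per the call.
weekly_gpus = gpus_from_racks(1_000)
print(f"{weekly_gpus:,} GPUs per week")
```

The call says "nearly 1,000" racks, so the result is an upper-end approximation rather than a reported shipment figure.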
Sampling of GB300 systems began earlier this month at the major CSPs, and we expect production shipments to commence later this quarter. GB300 will leverage the same architecture, same physical footprint and the same electrical and mechanical specifications as GB200. The GB300 drop-in design will allow CSPs to seamlessly transition their systems and manufacturing used for GB200 while maintaining high yields. GB300 GPUs with 50% more HBM will deliver another 50% increase in dense FP4 inference compute performance compared to the B200.
We remain committed to our annual product cadence with our road map extending through 2028, tightly aligned with the multiple year planning cycles of our customers. We are witnessing a sharp jump in inference demand. OpenAI, Microsoft and Google are seeing a step function leap in token generation. Microsoft processed over 100 trillion tokens in Q1, a fivefold increase on a year-over-year basis. This exponential growth in Azure OpenAI is representative of strong demand for Azure AI Foundry as well as other AI services across Microsoft's platform.
Inference serving startups are now serving models using B200, tripling their token generation rate and corresponding revenues for high-value reasoning models such as DeepSeek-R1 as reported by Artificial Analysis. NVIDIA Dynamo on Blackwell NVL72 turbocharges AI inference throughput by 30x for the new reasoning models sweeping the industry. Developer engagements increased with adoption ranging from LLM providers such as Perplexity to financial services institutions such as Capital One, who reduced agentic chatbot latency by 5x with Dynamo.
In the latest MLPerf Inference results, we submitted our first results using GB200 NVL72, delivering up to 30x higher inference throughput compared to our [ 8-GPU ] H200 submission on the challenging Llama 3.1 benchmark. This feat was achieved through a combination of tripling the performance per GPU as well as 9x more GPUs all connected on a single NVLink domain. And while Blackwell is still early in its life cycle, software optimizations have already improved its performance by 1.5x in the last month alone. We expect to continue improving the performance of Blackwell through its operational life as we have done with Hopper and Ampere. For example, we increased the inference performance of Hopper by 4x over 2 years. This is the benefit of NVIDIA's programmable CUDA architecture and rich ecosystem.
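The "up to 30x" figure can be read as the product of the two factors named above. A minimal sketch, assuming the gains multiply (ideal scaling across the NVLink domain, which is an assumption and not something the call states); the function name is hypothetical:

```python
def combined_speedup(per_gpu_gain: float, gpu_count_ratio: float) -> float:
    """Ideal combined throughput gain if per-GPU and GPU-count gains multiply."""
    return per_gpu_gain * gpu_count_ratio


# 72 GPUs in an NVL72 domain versus the 8-GPU H200 submission.
gpu_count_ratio = 72 / 8

# "Tripling the performance per GPU", per the call.
estimate = combined_speedup(3.0, gpu_count_ratio)
print(f"{gpu_count_ratio:.0f}x more GPUs, about {estimate:.0f}x combined")
```

Under these assumptions the product is 27x, which is in the same range as the quoted "up to 30x"; the remaining gap would have to come from effects this simple model does not capture.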
The pace and scale of AI factory deployments are accelerating with nearly 100 NVIDIA-powered AI factories in flight this quarter, a twofold increase year-over-year, with the average number of GPUs powering each factory also doubling in the same period. And more AI factory projects are starting across industries and geographies. NVIDIA's full stack architecture is underpinning AI factory deployments at industry leaders like AT&T, BYD, Capital One, Foxconn, MediaTek, and Telenor, and at strategically vital sovereign clouds like those recently announced in Saudi Arabia, Taiwan and the UAE.
We have a line of sight to projects requiring tens of gigawatts of NVIDIA AI infrastructure in the not-too-distant future. The transition from generative to agentic AI, AI capable of perceiving, reasoning, planning and acting, will transform every industry, every company and country. We envision AI agents as a new digital workforce capable of handling tasks ranging from customer service to complex decision-making processes.
We introduced the Llama Nemotron family of open reasoning models designed to supercharge agentic AI platforms for enterprises. Built on the Llama architecture, these models are available as NIMs, or NVIDIA inference microservices, with multiple sizes to meet diverse deployment needs. Our post-training enhancements have yielded a 20% accuracy boost and a 5x increase in inference speed. Leading platform companies, including Accenture, Cadence, Deloitte, and Microsoft, are transforming work with our reasoning models.
NVIDIA NeMo microservices are generally available across industries and are being leveraged by leading enterprises to build, optimize and scale AI applications. With NeMo, Cisco increased model accuracy by 40% and improved response time by 10x in its code assistant. NASDAQ realized a 30% improvement in accuracy and response time in its AI platform's search capabilities. And Shell's custom LLM achieved a 30% increase in accuracy when trained with NVIDIA NeMo. NeMo's parallelism techniques accelerated model training time by 20% when compared to other frameworks.
We also announced a partnership with Yum! Brands, the world's largest restaurant company, to bring NVIDIA AI to 500 of its restaurants this year, expanding to 61,000 restaurants over time, to streamline order-taking, optimize operations and enhance service across its restaurants. For AI-powered cybersecurity, leading companies like Check Point, CrowdStrike and Palo Alto Networks are using NVIDIA's AI security and software stack to build, optimize and secure agentic workflows, with CrowdStrike realizing 2x faster detection triage with 50% less compute cost.
Moving to networking. Sequential growth in networking resumed in Q1 with revenue up 64% quarter-over-quarter to $5 billion. Our customers continue to leverage our platform to efficiently scale up and scale out AI factory workloads. We created the world's fastest switch, NVLink. For scale up, our NVLink compute fabric, in its fifth generation, offers 14x the bandwidth of PCIe Gen 5. NVLink 72 carries 130 terabytes per second of bandwidth in a single rack, equivalent to the entirety of the world's peak Internet traffic. NVLink is a new growth vector and is off to a great start with Q1 shipments exceeding $1 billion.
At Computex, we announced NVLink Fusion. Hyperscale customers can now build semi-custom CPUs and accelerators that connect directly to the NVIDIA platform with NVLink. We are now enabling key partners, including ASIC providers such as MediaTek, Marvell, Alchip Technologies and Astera Labs as well as CPU suppliers such as Fujitsu and Qualcomm, to leverage NVLink Fusion to connect our respective ecosystems. For scale out, our enhanced Ethernet offerings deliver the highest-throughput, lowest-latency networking for AI.
Spectrum-X posted strong sequential and year-on-year growth and is now annualizing over $8 billion in revenue. Adoption is widespread across major CSPs and consumer Internet companies, including CoreWeave, Microsoft Azure, Oracle Cloud and xAI. This quarter, we added Google Cloud and Meta to the growing list of Spectrum-X customers.
We introduced Spectrum-X and Quantum-X silicon photonics switches featuring the world's most advanced co-packaged optics. These platforms will enable next-level AI factory scaling to millions of GPUs by increasing power efficiency by 3.5x and network resiliency by 10x, while accelerating customer time to market by 1.3x.
Transitioning to a quick summary of our revenue by geography. China as a percentage of our Data Center revenue was slightly below our expectations and down sequentially due to H20 export licensing controls. For Q2, we expect a meaningful decrease in China data center revenue. As a reminder, while Singapore represented nearly 20% of our Q1 billed revenue as many of our large customers use Singapore for centralized invoicing, our products are almost always shipped elsewhere.
Note that over 99% of H100, H200, and Blackwell data center compute revenue billed to Singapore was for orders from U.S.-based customers. Moving to gaming and AI PCs. Gaming revenue was a record $3.8 billion, increasing 48% sequentially and 42% year-on-year. Strong adoption by gamers, creatives and AI enthusiasts have made Blackwell our fastest ramp ever. Against a backdrop of robust demand, we greatly improved our supply and availability in Q1 and expect to continue these efforts in Q2.
AI is transforming the PC for creators and gamers. With a 100 million user installed base, GeForce represents the largest footprint for PC developers. This quarter, we added to our AI PC laptop offerings, including models capable of running Microsoft's Copilot+. This past quarter, we brought Blackwell architecture to mainstream gaming with the launch of GeForce RTX 5060 and 5060 Ti starting at just $299. The RTX 5060 also debuted in laptops starting at $1,099. These systems double the frame rate and reduce latency. These GeForce RTX 5060 and 5060 Ti desktop GPUs and laptops are now available.
In console gaming, the recently unveiled Nintendo Switch 2 leverages NVIDIA's neural rendering and AI technologies, including next-generation custom RTX GPUs with DLSS technology to deliver a giant leap in gaming performance to millions of players worldwide. Nintendo has shipped over 150 million Switch consoles to date, making it one of the most successful gaming systems in history.
Moving to Pro Visualization. Revenue of $509 million was flat sequentially and up 19% year-on-year. Tariff-related uncertainty temporarily impacted Q1 systems, and demand for our AI workstations is strong, and we expect sequential revenue growth to resume in Q2. NVIDIA DGX Spark and Station revolutionize personal computing by putting the power of an AI supercomputer in a desktop form factor. DGX Spark delivers up to 1 petaflop of AI compute while DGX Station offers an incredible 20 petaflops and is powered by the GB300 Superchip. DGX Spark will be available in calendar Q3 and DGX Station later this year.
We have deepened Omniverse's integration and adoption into some of the world's leading software platforms, including Databricks, SAP and Schneider Electric. New Omniverse blueprints such as Mega for at-scale robotic fleet management are being leveraged in Kion Group, Pegatron, Accenture and other leading companies to enhance industrial operations. At Computex, we showcased Omniverse's great traction with technology manufacturing leaders, including TSMC, Quanta, Foxconn and Pegatron.
Using Omniverse, TSMC saves months in work by designing fabs virtually. Foxconn accelerates thermal simulations by 150x, and Pegatron reduced assembly line defect rates by 67%. Lastly, with our automotive group. Revenue was $567 million, down 1% sequentially but up 72% year-on-year. Year-on-year growth was driven by the ramp of self-driving across a number of customers and robust end demand for NEVs.
We are partnering with GM to build the next-gen vehicles, factories and robots using NVIDIA AI, simulation and accelerated computing. And we are now in production with our full stack solution for Mercedes-Benz starting with the new CLA hitting roads in the next few months. We announced Isaac GR00T N1, the world's first open, fully customizable foundation model for humanoid robots, enabling generalized reasoning and skill development.
We also launched new open NVIDIA Cosmos world foundation models. Leading companies, including [ OneX ], Agility Robotics, Figure AI, Uber and Waabi, have begun integrating Cosmos into their operations for synthetic data generation, while Agility Robotics, Boston Dynamics, and Robotics are harnessing Isaac simulation to advance their humanoid efforts.
GE Healthcare is using the new NVIDIA Isaac platform for health care simulation built on NVIDIA Omniverse and using NVIDIA Cosmos. The platform speeds development of robotic imaging and surgery systems. The era of robotics is here. Billions of robots, hundreds of millions of autonomous vehicles and hundreds of thousands of robotic factories and warehouses will be developed.
All right. Moving to the rest of the P&L. GAAP gross margins and non-GAAP gross margins were 60.5% and 61%, respectively. Excluding the $4.5 billion charge, Q1 non-GAAP gross margins would have been 71.3%, slightly above our outlook at the beginning of the quarter. Sequentially, GAAP operating expenses were up 7% and non-GAAP operating expenses were up 6%, reflecting higher compensation and employee growth. Our investments include expanding our infrastructure capabilities and AI solutions, and we plan to grow these investments throughout the fiscal year. In Q1, we returned a record $14.3 billion to shareholders in the form of share repurchases and cash dividends. Our capital return program continues to be a key element of our capital allocation strategy.
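The "excluding the charge" margin can be approximately reconstructed from the rounded figures given on the call. A minimal sketch, assuming the full $4.5 billion charge was booked in cost of revenue and using the rounded $44 billion revenue figure; both are simplifying assumptions, and the function name is illustrative:

```python
def margin_excluding_charge(reported_margin: float,
                            revenue_bn: float,
                            charge_bn: float) -> float:
    """Add a one-time charge back to gross profit and return the adjusted margin.

    Assumes the whole charge reduced gross profit (i.e. sat in cost of revenue).
    """
    return reported_margin + charge_bn / revenue_bn


# Rounded inputs from the call: 61% non-GAAP gross margin,
# $44 billion revenue, $4.5 billion H20 charge.
adjusted = margin_excluding_charge(0.61, 44.0, 4.5)
print(f"Approximate non-GAAP gross margin excluding the charge: {adjusted:.1%}")
```

With these rounded inputs the result is about 71.2%, close to the 71.3% stated on the call; the small difference is consistent with rounding in the quoted revenue and margin.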
Let me turn to the outlook for the second quarter. Total revenue is expected to be $45 billion, plus or minus 2%. We expect modest sequential growth across all of our platforms. In Data Center, we anticipate the continued ramp of Blackwell to be partially offset by a decline in China revenue. Note, our outlook reflects a loss in H20 revenue of approximately $8 billion for the second quarter. GAAP and non-GAAP gross margins are expected to be 71.8% and 72%, respectively, plus or minus 50 basis points. We expect or Blackwell profitability to drive modest sequential improvement in gross margins. We are continuing to work towards achieving gross margins in the mid-70s range late this year.
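The guidance above is stated as midpoints with tolerances. A minimal sketch of the implied ranges, assuming the tolerances are symmetric around the midpoint; the helper names are hypothetical:

```python
def percent_range(midpoint: float, tolerance: float) -> tuple[float, float]:
    """Range for a midpoint with a relative tolerance (0.02 means plus or minus 2%)."""
    return midpoint * (1 - tolerance), midpoint * (1 + tolerance)


def basis_point_range(midpoint_pct: float, bps: float) -> tuple[float, float]:
    """Range for a percentage midpoint with a tolerance in basis points (100 bps = 1 point)."""
    delta = bps / 100
    return midpoint_pct - delta, midpoint_pct + delta


# Q2 revenue outlook: $45 billion, plus or minus 2%.
rev_low, rev_high = percent_range(45.0, 0.02)

# Non-GAAP gross margin outlook: 72%, plus or minus 50 basis points.
gm_low, gm_high = basis_point_range(72.0, 50)

print(f"Revenue: ${rev_low:.1f}B to ${rev_high:.1f}B")
print(f"Non-GAAP gross margin: {gm_low:.1f}% to {gm_high:.1f}%")
```

Read this way, the revenue outlook spans roughly $44.1 billion to $45.9 billion, and the non-GAAP gross margin outlook spans 71.5% to 72.5%.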
GAAP and non-GAAP operating expenses are expected to be approximately $5.7 billion and $4 billion, respectively, and we continue to expect full year fiscal year '26 operating expense growth to be in the mid-30% range. GAAP and non-GAAP other income and expenses are expected to be an income of approximately $450 million, excluding gains and losses from nonmarketable and publicly held equity securities. GAAP and non-GAAP tax rates are expected to be 16.5%, plus or minus 1%, excluding any discrete items. Further financial details are included in the CFO commentary and other information available on our IR website, including a new financial information AI agent.
Let me highlight upcoming events for the financial community. We will be at the BofA Global Technology Conference in San Francisco on June 4, the Rosenblatt Virtual AI Summit and NASDAQ Investor Conference in London on June 10, and GTC Paris at VivaTech on June 11 in Paris. We look forward to seeing you at these events. Our earnings call to discuss the results of our second quarter of fiscal 2026 is scheduled for August 27. Well, now let me turn it over to Jensen to make some remarks.
Thanks, Colette. We've had a busy and productive year. Let me share my perspective on some topics we're frequently asked about. On export control: China is one of the world's largest AI markets and a springboard to global success. With half of the world's AI researchers based there, the platform that wins China is positioned to lead globally. Today, however, the $50 billion China market is effectively closed to U.S. industry. The H20 export ban ended our Hopper data center business in China. We cannot reduce Hopper further to comply. As a result, we are taking a multibillion-dollar write-off on inventory that cannot be sold or repurposed. We are exploring limited ways to compete, but Hopper is no longer an option. China's AI moves on with or without U.S. chips. It has the compute to train and deploy advanced models. The question is not whether China will have AI; it already does. The question is whether one of the world's largest AI markets will run on American platforms. Shielding Chinese chipmakers from U.S. competition only strengthens them abroad and weakens America's position. Export restrictions have spurred China's innovation and scale.
The AI race is not just about chips. It's about which stack the world runs on. As that stack grows to include 6G and quantum, U.S. global infrastructure leadership is at stake. The U.S. has based its policy on the assumption that China cannot make AI chips. That assumption was always questionable and now it's clearly wrong. China has enormous manufacturing capability. In the end, the platform that wins the AI developers wins AI. Export controls should strengthen U.S. platforms, not drive half of the world's AI talent to rivals.
On DeepSeek: DeepSeek and Qwen from China are among the best open source models. Released freely, they've gained traction across the U.S., Europe and beyond. DeepSeek R1, like ChatGPT, introduced reasoning AI that produces better answers the longer it thinks. Reasoning AI enables step-by-step problem solving, planning and tool use, turning models into intelligent agents.
Reasoning is compute-intensive, requiring hundreds to thousands of times more tokens per task than previous one-shot inference. Reasoning models are driving a step-function surge in inference demand. AI scaling laws remain firmly intact, not only for training, but now inference too requires massive-scale compute. DeepSeek also underscores the strategic value of open source AI. When popular models are trained and optimized on U.S. platforms, it drives usage, feedback and continuous improvement, reinforcing American leadership across the stack.
U.S. platforms must remain the preferred platform for open source AI. That means supporting collaboration with top developers globally, including in China. America wins when models like DeepSeek and Qwen run best on American infrastructure.
Regarding onshore manufacturing, President Trump has outlined a bold vision to reshore advanced manufacturing, create jobs and strengthen national security. Future plants will be highly computerized in robotics. We share this vision. TSMC is building 6 fabs and 2 advanced packaging plants in Arizona to make chips for NVIDIA. Process qualification is underway with volume production expected by year-end. SPIL and Amkor are also investing in Arizona, constructing packaging, assembly and test facilities.
In Houston, we're partnering with Foxconn to construct a 1 million square foot factory to build AI supercomputers. Wistron is building a similar plant in Fort Worth, Texas. To encourage and support these investments, we've made substantial long-term purchase commitments, a deep investment in America's AI manufacturing future. Our goal: from chip to supercomputer, built in America within a year. Each GB200 NVLink72 rack contains 1.2 million components and weighs nearly 2 tons. No one has produced supercomputers on this scale. Our partners are doing an extraordinary job.
On the AI diffusion rule: President Trump rescinded the AI diffusion rule, calling it counterproductive, and proposed a new policy to promote U.S. AI tech with trusted partners. On his Middle East tour, he announced historic investments. I was honored to join him in announcing a 500-megawatt AI infrastructure project in Saudi Arabia and a 5-gigawatt AI campus in the UAE. President Trump wants U.S. tech to lead. The deals he announced are wins for America, creating jobs, advancing infrastructure, generating tax revenue and reducing the U.S. trade deficit.
The U.S. will always be NVIDIA's largest market and home to the largest installed base of our infrastructure. Every nation now sees AI as core to the next industrial revolution, a new industry that produces intelligence and essential infrastructure for every economy. Countries are racing to build national AI platforms to elevate their digital capabilities. At Computex, we announced Taiwan's first AI factory in partnership with Foxconn and the Taiwan government.
Last week, I was in Sweden to launch its first national AI infrastructure. Japan, Korea, India, Canada, France, the U.K., Germany, Italy, Spain, and more are now building national AI factories to empower startups, industries and societies. Sovereign AI is a new growth engine for NVIDIA. Toshiya, back to you.
Operator, we will now open the call for questions. Would you please poll for questions?
[Operator Instructions] Your first question comes from the line of Joe Moore with Morgan Stanley.
2. Question Answer
You guys have talked about this scaling up of inference around reasoning models for at least a year now. And we've really seen that come to fruition as you talked about. We've heard it from your customers. Can you give us a sense for how much of that demand you're able to serve and give us a sense for maybe how big the inference business is for you guys? And do we need full-on NVL72 rack-scale solutions for reasoning inference going forward?
Well, we would like to serve all of it, and I think we're on track to serve most of it. Grace Blackwell NVLink72 is the ideal engine today, the ideal computer thinking machine, if you will, for reasoning AI. There's a couple of reasons for that. The first reason is that the token generation amount, the number of tokens reasoning goes through, is 100x, 1,000x more than a one-shot chatbot.
It's essentially thinking to itself, breaking down a problem step-by-step. It might be planning multiple paths to an answer. It could be using tools, reading PDFs, reading web pages, watching videos and then producing a result, an answer. The longer it thinks, the better the answer, the smarter the answer is. And so what we would like to do, and the reason why Grace Blackwell was designed to give such a giant step-up in inference performance, is so that you could do all this and still get a response as quickly as possible.
Compared to Hopper, Grace Blackwell is some 40x higher in speed and throughput. And so this is going to be a huge, huge benefit in driving down the cost while improving the quality of response with excellent quality of service at the same time. So that's the fundamental reason. That was the core driving reason for Grace Blackwell NVLink72. Of course, in order to do that, we had to reinvent, literally redesign, the entire way that these supercomputers are built. But now we're in full production. It's going to be exciting. It's going to be incredibly exciting.
The next question comes from Vivek Arya with Bank of America Securities.
Just a clarification for Colette first. So on the China impact, I think previously, it was mentioned at about $15 billion, so you had the $8 billion in Q2. So is there still some left as a headwind for the remaining quarters? Just, Colette, how to model that?
And then a question, Jensen, for you. Back at GTC, you had outlined a path towards almost $1 trillion of AI spending over the next few years. Where are we in that build-out? And do you think it's going to be uniform, that you will see every spender, whether it's CSPs, sovereigns or enterprises, build out, or should we expect some periods of digestion in between? Just what are your customer discussions telling you about how to model growth for next year?
Yes, Vivek. Thanks so much for the question regarding H20. Yes, we recognized $4.6 billion of H20 revenue in Q1. We were unable to ship $2.5 billion, so the total for Q1 should have been $7 billion. When we look at our Q2, our Q2 is going to be meaningfully down in terms of China data center revenue. And we had highlighted the amount of orders that we had planned for H20 in Q2, and that was $8 billion.
Now going forward, we did have other orders going forward that we will not be able to fulfill. That is what was incorporated, therefore, in the amount that we wrote down of the $4.5 billion. That write-down was about inventory and purchase commitments, and our purchase commitments were about what we expected regarding the orders that we had received. Going forward, though, it's a bigger issue regarding the amount of the market that we will not be able to serve. We assess that TAM to be close to about $50 billion in the future as we don't have a product to enable for China.
In fact, the -- probably the best way to think through it is that AI is several things. Of course, we know that AI is this incredible technology that's going to transform every industry from, of course, the way we do software to health care and financial services to retail to, I guess, every industry, transportation, manufacturing. And we're at the beginning of that.
But maybe another way to think about that is where do we need intelligence, where do we need digital intelligence? And it's in every country, it's in every industry. And we know because of that, we recognize that AI is also an infrastructure. It's a way of developing a technology -- delivering a technology that requires factories and these factories produce tokens. And they, as I mentioned, are important to every single industry and every single country. And so on that basis, we're really at the very beginning of it because the adoption of this technology is really kind of in its early, early stages.
Now we've reached an extraordinary milestone with AIs that are reasoning or thinking, what people call inference time scaling. Of course, it created a whole new -- we've entered an era where inference is going to be a significant part of the compute workload. But anyhow, it's going to be a new infrastructure, and we're building it out in the cloud. The United States is really the early starter and available in U.S. clouds. And this is our largest market, our largest installed base and we continue to see that happening.
But beyond that, we're going to have to -- we're going to see AI go into enterprise, which is on-prem because so much of the data is still on-prem. Access control is really important. It's really hard to move all of every company's data into the cloud. And so we're going to move AI into the enterprise. And you saw that we announced a couple of really exciting new products, our RTX Pro Enterprise AI server that runs everything enterprise and AI, our DGX Spark and DGX Station, which is designed for developers who want to work on-prem. And so enterprise AI is just taking off.
Telcos. A lot of the telco infrastructure will be, in the future, software defined and built on AI, and so 6G is going to be built on AI and that infrastructure needs to be built out. As I said, it's very, very early stages. And then, of course, every factory today that makes things will have an AI factory that sits with it. And the AI factory is going to be creating AI and operating AI for the factory itself but also to power the products and the things that are made by the factory. So it's very clear that every company will have AI factories.
And very soon, there'll be robotics companies, robot companies and those companies will be also building AIs to drive the robots. And so we're at the beginning of all of this build-out.
The next question comes from C.J. Muse with Cantor Fitzgerald.
There have been many large GPU cluster investment announcements in the last month, and you alluded to a few of them with Saudi Arabia, the UAE. And then also we heard from Oracle and xAI, just to name a few. So my question, are there others that have yet to be announced of the same kind of scale and magnitude? And perhaps more importantly, how are these orders impacting your lead times for Blackwell and your current visibility sitting here today almost halfway through 2025?
Well, we have more orders today than we did at the last time I spoke about orders at GTC. However, we're also increasing our supply chain and building out our supply chain. They're doing a fantastic job. We're building it here onshore in the United States. But we're going to keep our supply chain quite busy for several -- many more years coming.
And with respect to further announcements, I'm going to be on the road next week through Europe. And it's -- just about every country needs to build out AI infrastructure and there are [ umpteenth ] AI factories being planned. We're -- I think in the remarks, Colette mentioned there are some 100 AI factories being built. There's a whole bunch that haven't been announced.
And I think the important concept here which makes it easier to understand is that like other technologies that impact literally every single industry, of course, electricity was one and it became infrastructure. Of course, the information infrastructure, which we now know as the Internet affects every single industry, every country, every society. Intelligence is surely one of those things. I don't know any company, industry, country who thinks that intelligence is optional. It's essential infrastructure.
And so we've now digitalized intelligence. And so I think we're clearly in the beginning of the build-out of this infrastructure. And every country will have it, I'm certain of that. Every industry will use it, that I'm certain of. And what's unique about this infrastructure is that it needs factories. It's a little bit like the energy infrastructure, electricity. It needs factories. We need factories to produce this intelligence, and the intelligence is getting more sophisticated.
We were talking about earlier that we had a huge breakthrough in the last couple of years with reasoning AI. And now there are agents that reason and there are super-agents that use a whole bunch of tools and then there's clusters of super agents where agents are working with agents, solving problems. And so you could just imagine, compared to one-shot chatbots and the agents that are now using AI built on these large language models, how much more compute-intensive they really need to be and are. So I think we're in the beginning of the build-out, and there should be many, many more announcements in the future.
Your next question comes from Ben Reitzes with Melius.
I wanted to ask, first to Colette, just a little clarification around the guidance and maybe putting it in a different way. The $8 billion for H20 just seems like it's roughly $3 billion more than most people thought with regard to what you'd be forgoing in the second quarter. So that would mean that with regard to your guidance, the rest of the business, in order to hit $45 billion, is doing $2 billion to $3 billion or so better. So I was wondering if that math made sense to you.
And then in terms of the guidance, that would imply the non-China business is doing a bit better than the Street expected. So wondering what the primary driver was there in your view. And then the second part of my question, Jensen, I know you guide one quarter at a time, but with regard to the AI diffusion rule being lifted and this momentum with sovereign, there have been times in your history where you guys have said on calls like this that you have more conviction in sequential growth throughout the year, et cetera. And given the unleashing of demand with AI diffusion being revoked and the supply chain increasing, does the environment give you more conviction in sequential growth as we go throughout the year? So first one for Colette and then next one for Jensen.
Thanks, Ben, for the question. When we look at our Q2 guidance and our commentary that we provided, that had the export controls not occurred, we would have had orders of about $8 billion for H20, that's correct. That was a possibility for what we would have had in our outlook for this quarter in Q2. So what we also have talked about here is the growth that we've seen in Blackwell, Blackwell across many of our customers as well as the growth that we continue to have in terms of supply that we need for our customers. So putting those together, that's where we came through with the guidance that we provided. I'm going to turn the rest over to Jensen to see how he wants to...
Yes. Thanks. Thanks, Ben. I would say compared to the beginning of the year, compared to the GTC time frame, there are 4 positive surprises. The first positive surprise is the step-function demand increase of reasoning AI. I think it is fairly clear now that AI is going through exponential growth, and reasoning AI really busted through concerns about hallucination or its ability to really solve problems, and I think a lot of people are crossing that barrier and realizing how incredibly effective agentic AI is and reasoning AI is. So number one is inference reasoning and the exponential growth there, demand growth.
The second one, you mentioned AI diffusion. It's really terrific to see that the AI diffusion rule was rescinded. President Trump wants America to win, and he also realizes that we're not the only country in the race. And he wants the United States to win and recognizes that we have to get the American stack out to the world and have the world build on top of American stacks instead of alternatives. And so AI diffusion happened, the rescinding of it happened at almost precisely the time that countries around the world are awakening to the importance of AI as an infrastructure, not just as a technology of great curiosity and great importance, but infrastructure for their industries and start-ups and society.
Just as they had to build out infrastructure for electricity and Internet, you got to build out an infrastructure for AI. I think that, that's an awakening, and that creates a lot of opportunity. The third is enterprise AI. Agents work and agents are doing -- these agents are really quite successful, much more than generative AI. Agentic AI is game-changing. Agents can understand ambiguous and rather implicit instructions and able to problem solve and use tools and have memory and so on. And so I think this is -- enterprise AI is ready to take off.
And it's taken us a few years to build a computing system that is able to integrate and run enterprise AI stacks, run enterprise IT stacks but add AI to it. And this is the RTX Pro Enterprise server that we announced at Computex just last week. And just about every major IT company has joined us, super excited about that. And so computing is one stack, one part of it. But remember, enterprise IT is really 3 pillars: it's compute, storage, and networking. And we've now finally put all 3 of them together, and we're going to market with that.
And then lastly, industrial AI. Remember, one of the implications of the world reordering, if you will, is regions onshoring manufacturing and building plants everywhere. In addition to AI factories, of course, there are new electronics manufacturing, chip manufacturing being built around the world. And all of these new plants and these new factories are being created at exactly the right time, when Omniverse and AI and all the work that we're doing with robotics is emerging.
And so this fourth pillar is quite important. Every factory will have an AI factory associated with it. And in order to create these physical AI systems, you really have to train a vast amount of data. So back to more data, more training, more AIs to be created, more computers. And so these 4 drivers are really kicking into turbocharge.
Your next question comes from Timothy Arcuri with UBS.
Jensen, I wanted to ask about China. It sounds like the July guidance assumes there's no SKU replacement for the H20. But if the President wants the U.S. to win, it seems like you're going to have to be allowed to ship something into China. So I guess I had 2 points on that. First of all, have you been approved to ship a new modified version into China? And you're currently building it but you just can't ship it in fiscal Q2?
And then you were sort of run rating $7 billion to $8 billion a quarter into China. Can we get back to those sorts of quarterly run rates once you get something that you're allowed to ship back into China? I think we're all trying to figure out how much to add back to our models and when. So whatever you can say there would be great.
The President has a plan. He has a vision and I trust him. With respect to our export controls, it's a set of limits. And the new set of limits pretty much makes it impossible for us to reduce Hopper any further for any productive use. And so with the new limits, it's kind of the end of the road for Hopper. We have some -- we have limited options. And so the key is to understand the limits and see if we can come up with interesting products that could continue to serve the Chinese market.
We don't have anything at the moment, but we're considering it. We're thinking about it. Obviously, the limits are quite stringent at the moment. And we have nothing to announce today. And when the time comes, we'll engage the administration and discuss that.
Your final question comes from the line of Aaron Rakers with Wells Fargo.
This is Jake on for Aaron. I was wondering if you could give some additional color around the strength you saw within the Networking business, particularly around the adoption of your Ethernet solutions at CSPs as well as any change you're seeing in network attach rates.
Yes, thank you for that. We now have 3 networking platforms, maybe 4. The first one is the scale-up platform to turn a computer into a much larger computer. Scaling up is incredibly hard to do. Scaling out is easier to do but scaling up is hard to do. And that platform is called NVLink. NVLink comes with chips and switches and NVLink spines and it's really complicated. But anyway, that's our new platform, scale-up platform.
In addition to InfiniBand, we also have Spectrum-X. We've been fairly consistent that Ethernet was designed for a lot of traffic that is independent. But in the case of AI, you have a lot of computers working together. And the traffic of AI is insanely bursty. Latency matters a lot because the AI is thinking and it wants to get work done as quickly as possible, and you've got a whole bunch of nodes working together. And so we enhanced Ethernet, adding capabilities like extremely low latency, congestion control and adaptive routing, the type of technologies that were available only in InfiniBand, to Ethernet.
And as a result, we improved the utilization of Ethernet in these clusters. These clusters are gigantic, from as low as 50% to as high as 85%, 90%. And so the difference is if you had a cluster that's $10 billion and you improve its effectiveness by 40%, that's worth $4 billion. It's incredible. And so Spectrum-X has been really, quite frankly, a home run. And this last quarter, as we said in the prepared remarks, we added 2 very significant CSPs to the Spectrum-X adoption.
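The back-of-envelope figure quoted above can be reproduced with a short sketch. It assumes that "effectiveness improved by 40%" means 40 percentage points of utilization (from 50% to 90%, the upper end of the range mentioned) and that the gain is valued at cluster cost. The function name and this valuation approach are illustrative assumptions, not something stated on the call.

```python
def utilization_gain_value(cluster_cost, util_before_pct, util_after_pct):
    """Value of the extra useful capacity, in the same units as cluster_cost.

    Assumes each percentage point of utilization is worth 1% of cluster cost.
    """
    gain_points = util_after_pct - util_before_pct
    return cluster_cost * gain_points / 100


# $10 billion cluster, utilization rising from 50% to 90% (hypothetical inputs
# chosen to match the numbers quoted in the answer).
value = utilization_gain_value(10, 50, 90)
print(f"Extra useful capacity: ${value:.0f}B")
```

With these inputs the result is $4 billion, matching the number given in the answer.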
And then the last one is BlueField, which is our control plane. The control plane network is used for storage and for security, and for many of these clusters that want to achieve isolation among their users, multi-tenant clusters, and still be able to have extremely high bare-metal performance, BlueField is ideal and is used in a lot of these cases. And so we have these 4 networking platforms that are all growing and we're doing really well. I'm very proud of the team.
That is all the time we have for questions. Jensen, I will turn the call back to you.
Thank you. This is the start of a powerful new wave of growth. Grace Blackwell is in full production. We're off to the races. We now have multiple significant growth engines. Inference, once the lighter workload, is surging with revenue-generating AI services. AI is growing faster and will be larger than any platform shift before, including the Internet, mobile and cloud.
Blackwell is built to power the full AI life cycle from training frontier models to running complex inference and reasoning agents at scale. Training demand continues to rise with breakthroughs in post-training, like reinforcement learning and synthetic data generation. But inference is exploding. Reasoning AI agents require orders of magnitude more compute.
The foundations of our next growth platforms are in place and ready to scale. Sovereign AI: nations are investing in AI infrastructure as they did for electricity and the Internet. Enterprise AI: AI must be deployable on-prem and integrated with existing IT. Our RTX Pro, DGX Spark and DGX Station enterprise AI systems are ready to modernize the $500 billion IT infrastructure on-prem or in the cloud. Every major IT provider is partnering with us.
Industrial AI: from training to digital twin simulation to deployment, NVIDIA Omniverse and Isaac are powering next-generation factories and humanoid robotic systems worldwide. The age of AI is here. From AI infrastructure and inference at scale to sovereign AI, enterprise AI, and industrial AI, NVIDIA is ready.
Join us at GTC Paris, our keynote at VivaTech on June 11, talking about quantum GPU computing, robotic factories and robots and celebrate our partnerships building AI factories across the region. The NVIDIA band will tour France, the U.K., Germany, and Belgium. Thank you for joining us at the earnings call today. See you in Paris.
This concludes today's conference call. You may now disconnect.
NVIDIA — Q1 2026 Earnings Call
📊 Quarter at a glance
- Revenue: $44 billion (+69% year over year)
- Data Center: $39 billion (+73% year over year)
- H20 effect: $4.6 billion of revenue recognized; $4.5 billion write-down on inventory and purchase obligations
- Margins: GAAP 60.5% / non-GAAP 61%; excluding the H20 charge, non-GAAP 71.3%
- Capital return: $14.3 billion in share repurchases and dividends
🎯 What management says
- Blackwell ramp: fastest product ramp in company history; Blackwell delivered ~70% of Data Center compute in Q1 and is driving inference scaling
- Product roadmap: GB200 NVL (an architectural shift) is GA; GB300 sampling is underway, with production starting later in the quarter; focus on drop-in compatibility
- Strategy & manufacturing: build-out of "AI factories", sovereign AI and onshore manufacturing (U.S. plants with partners) to secure supply and market share
🔭 Outlook & guidance
- Q2 revenue: expected at $45 billion ±2%
- China impact: forecast lost H20 revenue of ~$8 billion in Q2; management is examining limited, compliant alternatives
- Margins & opex: GAAP/non-GAAP margins ~71.8%/72% ±50 basis points; GAAP opex ≈ $5.7 billion, non-GAAP opex ≈ $4 billion; target: mid-70s margins later in the year
❓ Analyst questions
- China & H20: pointed questions on how to model the China loss; management has no new approvable SKU in the near term and cites only limited options
- Inference demand: analysts asked how much demand NVIDIA can serve; answer: Blackwell NVLink72 is designed to serve "as much as possible"
- Supply & orders: questions on lead times and large sovereign deals; management reports more orders than before GTC and an onshore manufacturing build-out to raise capacity
⚡ Bottom line
- Conclusion: very strong growth driven by the Blackwell ramp and the inference boom, but near-term volatility from U.S. export controls on H20 (large write-down and China headwind). Long-term demand, roadmap and capital returns remain robust; near-term uncertainty in China is elevated.
NVIDIA financial data
Revenue
Revenue is the sum of all of a company's income, for example from its products or services.
Direct costs
Direct costs are the costs incurred directly in producing the product or service.
Gross profit
Gross profit indicates how much of revenue remains in the company after deducting direct production costs. Expressed as a percentage of revenue, it is called the gross margin.
Selling, general and administrative expenses
Selling, general and administrative expenses (SG&A) include all spending on marketing and sales as well as the general administration of the company.
Research and development costs
Research and development costs (R&D) show how much the company invests in researching and developing its products. These costs are especially informative as a percentage of revenue and in comparison with direct competitors.
EBITDA
EBITDA (Earnings Before Interest, Taxes, Depreciation and Amortization) is the company's profit before interest, taxes, depreciation and amortization. Expressed as a percentage of revenue, it is called the EBITDA margin.
Depreciation and amortization
Depreciation and amortization represent reductions in the value of the company's assets (for example, through wear and tear on machinery).
EBIT (operating income)
EBIT (Earnings Before Interest and Taxes) is the company's profit before interest and taxes, also referred to as operating income. Expressed as a percentage of revenue, it is called the EBIT margin.
Net income
Net income is the profit or loss after deducting all costs.
| | Apr '26 | +/- % | % of revenue |
| --- | --- | --- | --- |
| Revenue | 253,491 | 71 % | 100 % |
| - Direct costs | 65,539 | 48 % | 26 % |
| Gross profit | 187,952 | 81 % | 74 % |
| - Selling, general and administrative expenses | 4,838 | 29 % | 2 % |
| - Research and development costs | 20,829 | 47 % | 8 % |
| EBITDA | 165,514 | 88 % | 65 % |
| - Depreciation and amortization | 3,229 | 56 % | 1 % |
| EBIT (operating income) | 162,285 | 88 % | 64 % |
| Net income | 159,613 | 108 % | 63 % |
Figures in millions of USD.
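As a cross-check on the table above, the sketch below recomputes each line item as a share of revenue and verifies two of the subtotals. The dictionary keys are illustrative names chosen here. The figures are copied from the table (millions of USD), and the reading of the last percentage column as "share of revenue" is an assumption that this arithmetic supports.

```python
# Figures from the table above, in millions of USD.
figures = {
    "revenue": 253_491,
    "direct_costs": 65_539,
    "gross_profit": 187_952,
    "sga": 4_838,
    "rnd": 20_829,
    "ebitda": 165_514,
    "depreciation": 3_229,
    "ebit": 162_285,
    "net_income": 159_613,
}


def share_of_revenue(name):
    """Line item as a whole-number percentage of revenue."""
    return round(figures[name] / figures["revenue"] * 100)


shares = {name: share_of_revenue(name) for name in figures}

# Subtotal checks: gross profit = revenue - direct costs,
# and EBIT = EBITDA - depreciation and amortization.
gross_check = figures["revenue"] - figures["direct_costs"]
ebit_check = figures["ebitda"] - figures["depreciation"]

for name, pct in shares.items():
    print(f"{name}: {pct} %")
```

The computed shares agree with the last column of the table, and both subtotal checks reproduce the reported gross profit and EBIT.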
Company profile
NVIDIA Corporation is a full-stack computing infrastructure company. The company optimizes data processing to increase computing performance and thereby solve computational problems. It operates in two main segments: Compute & Networking and Graphics. The Compute & Networking segment comprises the accelerated computing platform for data centers, networking, automotive artificial intelligence (AI), cockpit, autonomous driving development agreements and autonomous vehicle solutions, electric vehicle computing platforms, NVIDIA AI Enterprise and other software. The Graphics segment comprises GeForce GPUs for gaming and PCs, the GeForce NOW game-streaming service and related infrastructure, and solutions for gaming platforms. It also includes Quadro/NVIDIA RTX GPUs for enterprise workstation graphics. The product portfolio includes virtual GPU (vGPU) software for cloud-based visual and virtual computing, automotive platforms for infotainment systems, and the Omniverse enterprise software for building and operating metaverse and 3D internet applications. The company was founded in January 1993 by Jen-Hsun Huang, Chris A. Malachowsky and Curtis R. Priem and is headquartered in Santa Clara, California.
| Headquarters | USA |
| CEO | Mr. Huang |
| Employees | 42,000 |
| Founded | 1993 |
| Website | www.nvidia.com |


