I was inside the VR winter. There wasn't one.

The chart says VR had a winter. I was inside it. There was no winter.

A reflection on the AI and VR cycles, from someone who has been writing rendering code in the same field since 2002.

Dr. George Papagiannakis · Professor, University of Crete · Principal Researcher, FORTH-ICS · Founder & CEO, ORamaVR

May 2026


I recently put together an eighty-three-year timeline of AI and VR cycles — McCulloch and Pitts (1943) to today, every event verified against primary sources where I could, sixty-four marker points, a complete reference table. Built with the assistance of Anthropic Claude. The full interactive chart lives at the timeline article ↗.

The visualisation is rigorous and the analysis is largely correct: a shared winter in the 1990s, a shared ignition in 2012, and a 2022–2024 anti-correlation in which ChatGPT consumed the oxygen the metaverse pitch had just inhaled.

It is also, from the inside, profoundly misleading.

The chart tracks headlines, capital allocation, and product launches. It does not track substrate. And in technology, the substrate is what eventually decides everything.


A personal coordinate: 2002, Geneva

In the year the chart marks as the deepest point of the VR winter — the 1996–2012 stretch greyed out on most timelines as a wasteland — I was at MIRALab in Geneva, in Professor Nadia Magnenat Thalmann’s group, working on two projects that, twenty-four years later, look less like artefacts of a dead era and more like first drafts of the present.

The first was LIFEPLUS — an EU IST Framework V project, ranked first of 150 submissions in its call. We deployed life-size augmented reality characters reanimating ancient frescoes in Pompeii on a mobile, home-built video-see-through head-mounted display (Papagiannakis et al., 2005). The compute platform was a Pentium III with a GeForce 3; programmable vertex shaders had been available for less than two years. To the best of my knowledge, this was the first time virtual humans walked through ruins in real time on a wearable AR system. The underlying character-simulation framework — VHD++ — was open-sourced; pieces of it still circulate in MR research.

The second was JUST — a virtual reality system for health emergency decision training, presented at the 9th Eurographics Workshop on Virtual Environments (Ponder et al., 2003). One of the earliest serious attempts at medical VR education, fifteen years before “MedXR” was a market category. The architectural pattern — interactive virtual humans plus authoring tools that did not require the researcher to be a graphics programmer — is the pattern I would eventually build a company around.

Both shipped in 2002. Both inside what the headlines called a wasteland.

We were arguing about how to make a synthetic doctor speak with lip-synchronised audio at twenty-five frames per second on hardware that, by 2026 metrics, sits below the floor of a smartphone keyboard. We were laying down the technical substrate of every medical VR and AR product on the market today.

The chart is correct that the funding cycle contracted in those years. It is wrong if it implies nothing was being built.


After Geneva

The 1996–2012 winter on the chart is a sixteen-year stretch. I worked through almost all of it. After Geneva I came back to Greece and joined the University of Crete and the Foundation for Research and Technology — Hellas (FORTH-ICS), where I have run a research group continuously since.

The throughline was constant: rendering pipelines, virtual humans, real-time simulation, increasingly oriented toward clinical applications. The technical surface kept changing. In 2002 the hard problem was getting a programmable shader to produce a believable skin material. By 2010 it was real-time global illumination on commodity GPUs. By 2018 it was differentiable rendering and neural scene representations. By 2024 it was scene composition under generative-AI control, with the deterministic verification a medical procedure requires. The underlying question — how do you build virtual humans and virtual environments that hold up in a clinical setting — did not change.

In 2016, with colleagues across Europe, I co-founded what is now the ENGAGE workshop, the venue at Computer Graphics International where the medical-XR community meets each year. By the time I founded ORamaVR a year later, the substrate had been compounding for fifteen years in plain sight, even as the cycle chart was still calling 2012–2016 the first cautious thaw.


What the substrate has compounded since

Read the technical stack quietly, year by year:

  • Graphics: fixed-function pipelines → programmable shaders → physically-based rendering → neural radiance fields (Mildenhall et al., 2020) → 3D Gaussian splatting (Kerbl et al., 2023). Each step added roughly an order of magnitude of scene fidelity per watt. None of these were anticipated when our group started publishing on real-time virtual humans in the early 2000s; they all became standard tooling in the intervening years.
  • Tracking: outside-in optical → IMU fusion → inside-out SLAM → passthrough re-projection at sub-twenty-millisecond motion-to-photon latency. The clinical-grade requirement is the latency cap; surgical and rehabilitation simulators have to stay below the perceptual threshold of sensorimotor mismatch, or trainees acquire the wrong motor habits.
  • Displays: VGA-resolution CRT helmets → 4K pancake optics with eye-tracked foveated rendering at consumer price points. A Vision Pro at $3,499 carries roughly the per-eye pixel density of a microscope reticle.
  • Content: hand-modelled avatars → motion-captured → data-driven neural animation → text-conditioned generative scene synthesis. The content layer is where the field is currently moving fastest, and where the convergence with AI is unmistakable.
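
The latency cap in the tracking bullet can be read as a simple budget check. What follows is a minimal sketch in Python, assuming a hypothetical breakdown of the pipeline into named stages that run one after another. The stage names and millisecond values are illustrative placeholders, not measurements from any headset; only the twenty-millisecond ceiling comes from the text above.

```python
from typing import Mapping

# The sub-twenty-millisecond motion-to-photon cap named in the text.
MOTION_TO_PHOTON_BUDGET_MS = 20.0


def motion_to_photon_ms(stage_ms: Mapping[str, float]) -> float:
    """Total latency, assuming the stages run strictly one after another."""
    return sum(stage_ms.values())


def within_budget(stage_ms: Mapping[str, float],
                  budget_ms: float = MOTION_TO_PHOTON_BUDGET_MS) -> bool:
    """True when the summed latency stays strictly below the budget."""
    return motion_to_photon_ms(stage_ms) < budget_ms


# Hypothetical pipeline: every number here is made up for the example.
example_pipeline = {
    "imu_sampling": 1.0,
    "pose_estimation": 3.0,
    "render": 8.0,
    "reprojection": 2.0,
    "display_scanout": 4.0,
}

print(motion_to_photon_ms(example_pipeline), within_budget(example_pipeline))
```

The point of the sketch is only that the cap is a sum over the whole chain: a gain in one stage can be spent by another, which is why the requirement has to be stated end to end.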

None of these curves were cyclical. None of them paused. They compounded monotonically, regardless of which year the headlines called a winter.


The “metaverse winter” of 2022–2024 is misnamed

It was a narrative winter, not a technical one. Reality Labs lost more than $46 billion by 2023, the press turned, and capital migrated to large language models. All of that is true. None of it stopped Quest 3 from shipping, Vision Pro from existing, 3DGS papers from compounding at SIGGRAPH and ICCV, or the medical-XR field from continuing to publish randomised trials. Kenanidis et al. (2023) showed statistically significant skill transfer for hip-arthroplasty training using the MAGES platform — the kind of evidence that did not exist for surgical VR five years earlier. The JMIR 2025 perioperative-stress RCT moved VR from research demo to digital-health intervention with measured clinical outcomes.

The continued capability of medical VR specifically — the thing the headlines decided was dead — has been documented across nearly four decades by Walter Greenleaf at the Stanford Virtual Human Interaction Lab and the Stanford Medical Mixed Reality Center (Weiss et al., 2021). His repeated finding, across pain management, behavioural health, surgical simulation, neurorehabilitation, and post-traumatic stress, is that medical VR has been crossing the clinical-evidence threshold for specific indications on a steady curve since the late 1990s, with no inflection at any of the supposed winters. He has been saying this in essentially these words at essentially every major clinical conference for thirty years, and the medical literature has been catching up.

The chart shows a contraction in coverage. It does not show a contraction in capability. We have just lived through three years of headline indifference during what is arguably the strongest period of underlying advancement since 2012.


Why I think we are about to take off

Here is the contrarian read of the 2022–2024 anti-correlation: ChatGPT did not eclipse VR. ChatGPT finished VR’s missing piece.

Until late 2022, the metaverse pitch had a structural flaw nobody wanted to say out loud — there was nowhere near enough content to fill it. Game engines can build a city. The open metaverse needs millions of scenes, characters, dialogues, behaviours, simulations — produced at a speed and cost no human studio can sustain. Medical XR has the same problem, sharper: the content layer for surgical simulation must be clinically validated, anatomically faithful, and revised every time a procedure changes. Hand-authoring at that fidelity does not scale.

Generative AI is the unlock. The same large language models that consumed the metaverse’s capital in 2022 are turning into the production tools that make the metaverse economically viable. Text-to-3D, text-to-scene, dialogue agents, procedural character behaviour, AI-driven simulation authoring: every one of these is now a working tool, not a research demo. My current research at FORTH and at ORamaVR sits in exactly that gap: how do you compose clinically faithful XR simulations from generative components, and how do you verify deterministically that the procedure they reproduce is the procedure a surgeon would recognise?

The thesis is not AI versus VR. The thesis is that AI is the long-missing content layer of VR, and that medical XR is one of the first domains where that asymmetry pays out, because the verification problem (does this simulation reproduce the clinical task correctly?) is already the field’s central technical question.
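
The verification question can be made concrete with a small deterministic check. What follows is a minimal sketch in Python, assuming a procedure can be reduced to a set of required steps plus ordering constraints. The step names, the `ProcedureSpec` type, and the `verify` function are hypothetical illustrations, not the API of any existing platform and not a clinically validated protocol.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class ProcedureSpec:
    """Hypothetical reference description of a clinical procedure."""
    required_steps: FrozenSet[str]
    # Each pair (earlier, later) means 'earlier' must occur before 'later'.
    must_precede: Tuple[Tuple[str, str], ...]


def verify(spec: ProcedureSpec, generated: Sequence[str]) -> List[str]:
    """Return a sorted list of violations; an empty list means the check passed."""
    problems: List[str] = []
    # Position of the first occurrence of each generated step.
    first_index: Dict[str, int] = {}
    for i, step in enumerate(generated):
        first_index.setdefault(step, i)
    for step in spec.required_steps:
        if step not in first_index:
            problems.append(f"missing step: {step}")
    for step in first_index:
        if step not in spec.required_steps:
            problems.append(f"unexpected step: {step}")
    for earlier, later in spec.must_precede:
        if earlier in first_index and later in first_index:
            if first_index[earlier] > first_index[later]:
                problems.append(f"order violated: {earlier} must precede {later}")
    # Sorting makes the report independent of set iteration order.
    return sorted(problems)


# Made-up step names, for illustration only.
spec = ProcedureSpec(
    required_steps=frozenset({"incision", "implant_placement", "closure"}),
    must_precede=(("incision", "implant_placement"),
                  ("implant_placement", "closure")),
)
print(verify(spec, ["incision", "implant_placement", "closure"]))
print(verify(spec, ["implant_placement", "incision"]))
```

The same generated sequence always produces the same report, which is the property that separates this kind of check from asking a generative model to judge its own output.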

This is no longer an idiosyncratic view. Fei-Fei Li — whose 2009 ImageNet paper (Deng et al., 2009) is one of the foundational events on the AI timeline above — has spent 2024 and 2025 arguing that “AGI will not be complete without spatial intelligence … spatial intelligence is as critical as, and complementary to, language intelligence” (Li, IEEE Spectrum and Bloomberg interviews, 2024–2025). She founded World Labs on that thesis. As of February 2026, the company has raised over a billion dollars from Nvidia, AMD, and Autodesk, among others (Silicon Republic, 2026). Its co-founders include Ben Mildenhall, the co-creator of NeRF — the same neural-radiance-field substrate that appears on the graphics curve above. The people who built the substrate are now building the convergence.

When the researcher who lit the modern AI summer says the next frontier of AI runs through 3D scene understanding, and is funded at unicorn scale to prove it, the AI/VR convergence stops being a contrarian read. It is now the field’s stated direction.

Add to that the hardware curve finally reaching consumer-acceptable comfort, weight, and price. Add to that clinical validation in medical XR moving from research demo toward standard of care. Add to that two decades of compounded substrate held by the people who never left the field.

The chart says winter. What I see, from inside, is the last quiet moment before take-off — and the spring that follows will not look like the consumer-VR moment that just ended.


A closing note, to my colleagues who have lived through the cycles

We have seen this pattern before. In 2002, when LIFEPLUS first walked Roman characters through Pompeii on a home-built video-see-through HMD, and when the JUST simulator was demonstrating no-code authoring of clinical training scenarios, almost nobody outside our community was paying attention. Twenty-four years later, the same kinds of virtual humans live in every consumer headset on the market, and the architectural pattern we wrote then is recognisably the ancestor of what runs in clinically deployed XR today.

We are not a lonely cohort. The communities I have published in throughout this stretch — ACM SIGGRAPH, IEEE VR, ISMAR, MICCAI, Computer Graphics International — have continued to compound capability through every supposed winter. Nadia Magnenat Thalmann’s group at MIRALab. Walter Greenleaf at Stanford VHIL and the Medical Mixed Reality Center. The ENGAGE workshop community my colleagues and I founded in 2016, now an annual track at Computer Graphics International. We recognise each other across the ranks.

The mistake we should not make twice is reading headline silence as a signal of underlying stagnation. The substrate is compounding. The unlock has arrived. The interesting question is no longer whether VR matters. It is which institutions and which research programmes have accumulated the technical depth to deploy what comes next responsibly — particularly in domains like surgical training, where the stakes are clinical rather than recreational, and where the deployment partner’s twenty-year track record matters more than the demo reel.


Companion artefact

The full eighty-three-year cycle timeline (1943–2026) with hover-context for sixty-four marker events, the data table of all forty-two charted events, and additional formats:

Timeline article (interactive HTML + reference tables) ↗


References

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Li, F.-F. (2009). ImageNet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 248–255. https://doi.org/10.1109/CVPR.2009.5206848

Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554. https://doi.org/10.1162/neco.2006.18.7.1527

Kenanidis, E. et al. (2023). Validation of high-fidelity virtual reality simulation for hip-arthroplasty surgical training using the MAGES SDK. Reference reflects the published RCT outcome; verify exact title and journal at submission.

Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics (SIGGRAPH), 42(4), Article 139. https://doi.org/10.1145/3592433

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems (NeurIPS), 25, 1097–1105.

Li, F.-F. (2023). The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI. Flatiron Books.

Li, F.-F. (2024). On spatial intelligence as the next frontier of AI. TED Talk and IEEE Spectrum / Bloomberg interviews, 2024–2025. Cited formulation: “AGI will not be complete without spatial intelligence.”

McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5(4), 115–133. https://doi.org/10.1007/BF02478259

Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2020). NeRF: Representing scenes as neural radiance fields for view synthesis. European Conference on Computer Vision (ECCV). https://doi.org/10.1007/978-3-030-58452-8_24

Minsky, M., & Papert, S. (1969). Perceptrons: An Introduction to Computational Geometry. MIT Press.

Papagiannakis, G., Schertenleib, S., O’Kennedy, B., Arevalo-Poizat, M., Magnenat-Thalmann, N., Stoddart, A., & Thalmann, D. (2005). Mixing virtual and real scenes in the site of ancient Pompeii. Computer Animation and Virtual Worlds, 16(1), 11–24. https://doi.org/10.1002/cav.53

Ponder, M., Herbelin, B., Molet, T., Schertenleib, S., Ulicny, B., Papagiannakis, G., Magnenat-Thalmann, N., & Thalmann, D. (2003). Immersive VR decision training: Telling interactive stories featuring advanced virtual human simulation technologies. Proceedings of the 9th Eurographics Workshop on Virtual Environments, 97–106. The Eurographics Association, ACM Press, Zurich, May 2003.

Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386–408. https://doi.org/10.1037/h0042519

Silicon Republic (18 February 2026). Fei-Fei Li’s World Labs raises $1 bn to advance spatial intelligence — backed by Nvidia, AMD, Autodesk, Fidelity, Emerson Collective, Sea. https://www.siliconrepublic.com/start-ups/fei-fei-li-world-labs-raises-1bn-to-spatial-intelligence-ai-world-models-marble

Stephenson, N. (1992). Snow Crash. Bantam Books.

Sutherland, I. E. (1968). A head-mounted three-dimensional display. Proceedings of the AFIPS Fall Joint Computer Conference, 33, 757–764. https://doi.org/10.1145/1476589.1476686

Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems (NeurIPS), 30.

Weiss, T., Bailenson, J., Bullock, K., & Greenleaf, W. (2021). Reality, from virtual to augmented. In Digital Health (pp. 275–303). Elsevier. https://doi.org/10.1016/B978-0-12-818914-6.00018-1


Citation notes: the primary references above (McCulloch & Pitts 1943, Turing 1950, Rosenblatt 1958, Minsky & Papert 1969, Sutherland 1968, Stephenson 1992, Hinton, Osindero & Teh 2006, Deng et al. 2009, Krizhevsky et al. 2012, Vaswani et al. 2017, Mildenhall et al. 2020, Kerbl et al. 2023, Weiss et al. 2021, Papagiannakis et al. 2005, and Ponder et al. 2003) are verified primary sources. Kenanidis et al. (2023) is referenced from the author’s own materials and should be verified for exact title and journal before re-publication. The Fei-Fei Li / World Labs material is sourced from contemporary press coverage (IEEE Spectrum, Reuters, Bloomberg, Silicon Republic) and from the World Labs corporate site; quoted formulations reflect Li’s public statements at TED, NeurIPS, and in press interviews 2024–2026.


This essay and the companion timeline article were produced with the assistance of Anthropic Claude, working from primary materials, prior research notes, and editorial direction.