
The chart says VR had a winter. I was inside it. There was no winter.
A reflection on the AI and VR cycles, from someone who has been writing rendering code in the same field since 2002.
Dr. George Papagiannakis · Professor, University of Crete · Principal Researcher, FORTH-ICS · Founder & CEO, ORamaVR
May 2026 — v3
I recently put together an eighty-three-year timeline of AI and VR cycles — McCulloch and Pitts (1943) to today, every event verified against primary sources where I could, sixty-four marker points, a complete reference table. Built with the assistance of Anthropic Claude. The full interactive chart lives at the timeline article ↗.
The visualisation is rigorous and the analysis is largely correct: a shared winter in the 1990s, a shared ignition in 2012, and a 2022–2024 anti-correlation in which ChatGPT consumed the oxygen the metaverse pitch had just inhaled.
It is also, from the inside, profoundly misleading.
The chart tracks headlines, capital allocation, and product launches. It does not track substrate. And in technology, the substrate is what eventually decides everything.
A personal coordinate: 2002, Geneva
In the year the chart marks as the deepest point of the VR winter — the 1996–2012 stretch greyed out on most timelines as a wasteland — I was at MIRALab in Geneva, in Professor Nadia Magnenat Thalmann’s group, working on two projects that, twenty-four years later, look less like artefacts of a dead era and more like first drafts of the present.
The first was LIFEPLUS, an EU IST Framework V project ranked first of 150 submissions in its call. We deployed life-size augmented reality characters reanimating ancient frescoes in Pompeii, using a mobile, home-built video-see-through head-mounted display with real-time, markerless, SLAM-based AR camera tracking (Papagiannakis et al., 2005). The compute platform was a Pentium III with a GeForce 3; programmable vertex shaders had been available for less than two years. To the best of my knowledge, this was the first time virtual humans walked through ruins in real time on a wearable AR system. The underlying character-simulation framework, VHD++, was open-sourced; pieces of it still circulate in MR research.
The second was JUST — a virtual reality system for health emergency decision training, presented at the 9th Eurographics Workshop on Virtual Environments (Ponder et al., 2003). One of the earliest serious attempts at medical VR education, fifteen years before “MedXR” was a market category. The architectural pattern — interactive virtual humans plus authoring tools that did not require the researcher to be a graphics programmer — is the pattern I would eventually build a company around.
Both shipped in 2002. Both inside what the headlines called a wasteland.
We were arguing about how to make a synthetic doctor speak with lip-synchronised audio at twenty-five frames per second on hardware that, by 2026 metrics, sits below the floor of a smartphone keyboard. We were laying down the technical substrate of every medical VR and AR product on the market today.
The chart is correct that the funding cycle contracted in those years. It is wrong if it implies nothing was being built.
After Geneva
The 1996–2012 winter on the chart is a sixteen-year stretch. I worked through almost all of it. After Geneva I came back to Greece and joined the University of Crete and the Foundation for Research and Technology - Hellas (FORTH-ICS), where I have run a research group continuously since. In 2011 I returned to FORTH on a Marie Skłodowska-Curie Intra-European Fellowship (HiFi-PRINTER, on high-fidelity rendering of interactive characters), in exactly the year the chart calls the bottom of the curve.
The throughline was constant: rendering pipelines, virtual humans, real-time simulation, increasingly oriented toward clinical applications. The technical surface kept changing. In 2002 the hard problem was getting a programmable shader to produce a believable skin material. By 2010 it was real-time global illumination on commodity GPUs. By 2018 it was geometric algebra rendering and GPU-powered representations. By 2024 it was scene composition under generative-AI control, with the deterministic verification a medical procedure requires. The underlying question — how do you build virtual humans and virtual environments that hold up in a clinical setting — did not change.
The substrate bet I made in this stretch, and the one I am still making, was on Geometric Algebra (GA) as a unified mathematical layer for character animation, rendering, and — now — generative scene composition. Conformal GA represents rotation, translation, and uniform scaling as a single kind of object (a motor), which collapses the bookkeeping that a graphics engine usually carries across quaternions, dual-quaternions, and 4×4 matrices. In 2016, with Margarita Papaefthymiou and Dietmar Hildenbrand, we published the first GPU-resident CGA pipeline for animation interpolation and skinning of deformable characters (Papaefthymiou et al., 2016). At the same Computer Graphics International conference in Heraklion that year, we co-founded the ENGAGE workshop on geometric algebra for graphics and engineering, which has run annually ever since under CGS/CGI auspices. The same algebraic substrate later carried over to real-time cut/tear/drill of soft bodies for surgical simulation, to networked VR state synchronisation, and most recently to LLM-driven 3D scene editing where CGA motors give the language model a closed, composable grammar of spatial operations (Kolyvakis et al., 2025).
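To make the "single kind of object" claim concrete, here is a minimal CPU-side Python sketch. It is an illustration under stated assumptions, not the GPU pipeline of Papaefthymiou et al. (2016): multivectors are plain dictionaries keyed by basis-blade bitmask, and every name in it (`gp`, `up`, `down`, `translator`, `rotor_z`, `dilator`, `apply`) is hypothetical. It follows this essay's usage in calling the combined translate-rotate-scale versor a motor.

```python
import math
from itertools import product

# Basis vectors e1, e2, e3, e+, e- are bits 0..4 of a blade bitmask.
E1, E2, E3, EP, EM = 1, 2, 4, 8, 16
METRIC = [1.0, 1.0, 1.0, 1.0, -1.0]  # e- squares to -1, the rest to +1

def blade_product(a, b):
    """Geometric product of two basis blades: returns (sign, blade)."""
    swaps, t = 0, a >> 1
    while t:  # count the swaps needed to reach canonical order
        swaps += bin(t & b).count("1")
        t >>= 1
    sign = -1.0 if swaps & 1 else 1.0
    for i in range(5):  # repeated basis vectors contract with the metric
        if (a & b) >> i & 1:
            sign *= METRIC[i]
    return sign, a ^ b

def gp(A, B):
    """Geometric product of two multivectors stored as {blade: coeff}."""
    out = {}
    for (a, x), (b, y) in product(A.items(), B.items()):
        sign, blade = blade_product(a, b)
        out[blade] = out.get(blade, 0.0) + sign * x * y
    return out

def reverse(A):
    """Reversion: grade-k blades pick up the sign (-1)^(k(k-1)/2)."""
    def sign(blade):
        k = bin(blade).count("1")
        return -1.0 if (k * (k - 1) // 2) & 1 else 1.0
    return {b: sign(b) * c for b, c in A.items()}

def add(A, B, scale=1.0):
    out = dict(A)
    for b, c in B.items():
        out[b] = out.get(b, 0.0) + scale * c
    return out

E_INF = {EP: 1.0, EM: 1.0}   # point at infinity: e- + e+
E_O = {EP: -0.5, EM: 0.5}    # origin: (e- - e+) / 2

def vec(x, y, z):
    return {E1: x, E2: y, E3: z}

def up(x, y, z):
    """Embed a Euclidean point as a conformal (null) vector."""
    return add(add(vec(x, y, z), E_INF, 0.5 * (x * x + y * y + z * z)), E_O)

def down(X):
    """Recover the Euclidean point, dividing out the conformal weight."""
    w = X.get(EM, 0.0) - X.get(EP, 0.0)
    return tuple(X.get(e, 0.0) / w for e in (E1, E2, E3))

def translator(tx, ty, tz):
    return add({0: 1.0}, gp(vec(tx, ty, tz), E_INF), -0.5)

def rotor_z(theta):
    return {0: math.cos(theta / 2), E1 | E2: -math.sin(theta / 2)}

def dilator(s):
    a = 0.5 * math.log(s)
    return {0: math.cosh(a), EP | EM: -math.sinh(a)}

def apply(M, p):
    """Sandwich product M X ~M applied to a Euclidean point p."""
    return down(gp(gp(M, up(*p)), reverse(M)))

# One composed object: scale by 2, rotate 90 degrees about z, translate by (0, 0, 5).
motor = gp(gp(translator(0, 0, 5), rotor_z(math.pi / 2)), dilator(2.0))
result = apply(motor, (1.0, 0.0, 0.0))
print(result)  # approximately (0.0, 2.0, 5.0), up to floating-point error
```

The point of the sketch is that translation, rotation, and scaling are all built from the same product `gp`, composed by the same product, and applied by the same sandwich, which is the bookkeeping a matrix-plus-quaternion engine otherwise spreads across several types.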
Alongside the algebra, a second line ran through the same years: an open, shader-based teaching framework — glGA, presented at Eurographics 2014 — that ten years later became pyGANDALF, awarded Best Education Paper at ACM SIGGRAPH Asia 2024 (Petropoulos et al., 2024). Same pedagogical premise — write the smallest readable system that exposes the modern graphics pipeline end-to-end — ported forward through a decade of API churn.
In 2020, with the substrate by then compounded for nearly two decades, we incorporated ORamaVR SA in Geneva — back to the city where LIFEPLUS and JUST had shipped eighteen years earlier — to take this work into clinical deployment. The chart was still calling the years around 2016 the first cautious thaw. From inside the field, nothing about that decade had felt cautious; the work had been continuous and the technical depth had been accumulating in plain sight.
What the substrate has compounded since
Read the technical stack quietly, year by year:
- Graphics: fixed-function pipelines → programmable shaders → physically-based rendering → neural radiance fields (Mildenhall et al., 2020) → 3D Gaussian splatting (Kerbl et al., 2023). Each step added roughly an order of magnitude of scene fidelity per watt. None of these were anticipated when our group started publishing on real-time virtual humans in the early 2000s; they all became standard tooling in the intervening years.
- Tracking: outside-in optical → IMU fusion → inside-out SLAM → passthrough re-projection at sub-twenty-millisecond motion-to-photon latency. The clinical-grade requirement is the latency cap; surgical and rehabilitation simulators have to stay below the perceptual threshold of sensorimotor mismatch, or trainees acquire the wrong motor habits.
- Displays: VGA-resolution CRT helmets → 4K pancake optics with eye-tracked foveated rendering, at consumer price points and with the per-eye pixel density of a microscope reticle.
- Content: hand-modelled avatars → motion-captured → data-driven neural animation → text-conditioned generative scene synthesis with synthetic embodied agents. The content layer is where the field is currently moving fastest, and where the convergence with agentic AI is unmistakable.
None of these curves were cyclical. None of them paused. They compounded monotonically, regardless of which year the headlines called a winter.
The “metaverse winter” of 2022–2024 is misnamed
It was a narrative winter, not a technical one. Reality Labs lost more than $46 billion by 2023, the press turned, and capital migrated to large language models. All of that is true. None of it stopped Quest 3 from shipping, Vision Pro from existing, 3DGS papers from compounding at SIGGRAPH and ICCV, or the medical-XR field from continuing to publish randomised trials. Kenanidis et al. (2023) showed statistically significant skill transfer for hip-arthroplasty training using the MAGES platform — the kind of evidence that did not exist for surgical VR five years earlier. The JMIR 2025 perioperative-stress RCT moved VR from research demo to digital-health intervention with measured clinical outcomes.
The continued capability of medical VR specifically — the thing the headlines decided was dead — has been documented across nearly four decades by Walter Greenleaf at the Stanford Virtual Human Interaction Lab and the Stanford Medical Mixed Reality Center (Weiss et al., 2021). His repeated finding, across pain management, behavioural health, surgical simulation, neurorehabilitation, and post-traumatic stress, is that medical VR has been crossing the clinical-evidence threshold for specific indications on a steady curve since the late 1990s, with no inflection at any of the supposed winters. He has been saying this in essentially these words at essentially every major clinical conference for thirty years, and the medical literature has been catching up.
The chart shows a contraction in coverage. It does not show a contraction in capability. We have just lived through three years of headline disinterest during what is arguably the strongest period of underlying advancement since 2012.
Why I think we are about to take off
Here is the contrarian read of the 2022–2024 anti-correlation: ChatGPT did not eclipse VR. ChatGPT finished VR’s missing piece.
Until late 2022, the metaverse pitch had a structural flaw nobody wanted to say out loud — there was nowhere near enough content to fill it. Game engines can build a city. The open metaverse needs millions of scenes, characters, dialogues, behaviours, simulations — produced at a speed and cost no human studio can sustain. Medical XR has the same problem, sharper: the content layer for surgical simulation must be clinically validated, anatomically faithful, and revised every time a procedure changes. Hand-authoring at that fidelity does not scale.
Agentic, Embodied, Generative, and Spatial AI is the unlock. The same large language models that consumed the metaverse’s capital in 2022 are turning into the production tools that make the metaverse economically viable. Text-to-3D, text-to-scene, dialogue-capable synthetic embodied agents, procedural character behaviour, agentic AI-driven authoring of spatial simulations: every one of these is now a working tool, not a research demo. My current research at ORamaVR sits in exactly that gap: how do you compose clinically faithful XR simulations from generative components, and how do you verify deterministically that the procedure they reproduce is the procedure a surgeon would recognise?
The thesis is not AI versus VR. The thesis is that AI is the long-missing content layer of VR, and that medical XR is one of the first domains where that asymmetry pays out, because the verification problem (does this simulation reproduce the clinical task correctly?) is already the field’s central technical question.
This is no longer an idiosyncratic view. Fei-Fei Li — whose 2009 ImageNet paper (Deng et al., 2009) is one of the foundational events on the AI timeline above — has spent 2024 and 2025 arguing that “AGI will not be complete without spatial intelligence … spatial intelligence is as critical as, and complementary to, language intelligence” (Li, IEEE Spectrum and Bloomberg interviews, 2024–2025). She founded World Labs on that thesis. As of February 2026, the company has raised over a billion dollars from Nvidia, AMD, and Autodesk, among others (Silicon Republic, 2026). Its co-founders include Ben Mildenhall, the co-creator of NeRF — the same neural-radiance-field substrate that appears on the graphics curve above. The people who built the substrate are now building the convergence.
When the researcher who lit the modern AI summer says the next frontier of AI runs through 3D scene understanding, and is funded at unicorn scale to prove it, the AI/VR convergence stops being a contrarian read. It is now the field’s stated direction.
Add to that the hardware curve finally reaching consumer-acceptable comfort, weight, and price. Add to that clinical validation in medical XR moving from research demo toward standard of care. Add to that two decades of compounded substrate held by the people who never left the field.
The chart says winter. What I see, from inside, is the last quiet moment before take-off — and the spring that follows will not look like the consumer-VR moment that just ended.
A closing note, to my colleagues who have lived through the cycles
We have seen this pattern before. In 2002, when LIFEPLUS first walked Roman characters through Pompeii on a home-built video-see-through HMD, and when the JUST simulator was demonstrating no-code authoring of clinical training scenarios, almost nobody outside our community was paying attention. Twenty-four years later, the same kinds of virtual humans live in every consumer headset on the market, and the architectural pattern we wrote then is recognisably the ancestor of what runs in clinically deployed XR today.
We are not a lonely cohort. The communities I have published in throughout this stretch — ACM SIGGRAPH, IEEE VR, ISMAR, MICCAI, Computer Graphics International — have continued to compound capability through every supposed winter. Nadia Magnenat Thalmann’s group at MIRALab. Walter Greenleaf at Stanford VHIL and the Medical Mixed Reality Center. The ENGAGE workshop community my colleagues and I founded in 2016, now an annual track at Computer Graphics International. We recognise each other across the ranks.
The mistake we should not make twice is reading headline silence as a signal of underlying stagnation. The substrate is compounding. The unlock has arrived. The interesting question is no longer whether VR matters. It is which startups, institutions, and research programmes have accumulated the technical depth to deploy what comes next responsibly, particularly in domains like medical training, where the stakes are clinical rather than recreational, and where the deployment partner’s track record matters more than the demo reel.
Companion artefact
The full eighty-three-year cycle timeline (1943–2026) with hover-context for sixty-four marker events, the data table of all forty-two charted events, and additional formats:
→ Timeline article (interactive HTML + reference tables) ↗
References
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Li, F.-F. (2009). ImageNet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 248–255. https://doi.org/10.1109/CVPR.2009.5206848
Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554. https://doi.org/10.1162/neco.2006.18.7.1527
Kenanidis, E. et al. (2023). Validation of high-fidelity virtual reality simulation for hip-arthroplasty surgical training using the MAGES SDK. Reference reflects the published RCT outcome; verify exact title and journal at submission.
Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics (SIGGRAPH), 42(4), Article 139. https://doi.org/10.1145/3592433
Kolyvakis, P., Kamarianakis, M., & Papagiannakis, G. (2025). Geometric Algebra meets Large Language Models: Instruction-based transformations of separate meshes in 3D, interactive and controllable scenes. IEEE International Symposium on Emerging Metaverse (ISEMV), co-located with ICCV 2025. (Prior arXiv version: Angelis, Kolyvakis, Kamarianakis, & Papagiannakis, 2024, https://doi.org/10.48550/arXiv.2408.02275. Verify final venue and author order at submission.)
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems (NeurIPS), 25, 1097–1105.
Li, F.-F. (2023). The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI. Flatiron Books.
Li, F.-F. (2024). On spatial intelligence as the next frontier of AI. TED Talk and IEEE Spectrum / Bloomberg interviews, 2024–2025. Cited formulation: “AGI will not be complete without spatial intelligence.”
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5(4), 115–133. https://doi.org/10.1007/BF02478259
Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2020). NeRF: Representing scenes as neural radiance fields for view synthesis. European Conference on Computer Vision (ECCV). https://doi.org/10.1007/978-3-030-58452-8_24
Minsky, M., & Papert, S. (1969). Perceptrons: An Introduction to Computational Geometry. MIT Press.
Papaefthymiou, M., Hildenbrand, D., & Papagiannakis, G. (2016). An inclusive Conformal Geometric Algebra GPU animation interpolation and deformation algorithm. The Visual Computer, 32(6–8), 751–759. https://doi.org/10.1007/s00371-016-1270-8 (Presented at Computer Graphics International 2016, Heraklion.)
Papagiannakis, G., Schertenleib, S., O’Kennedy, B., Arevalo-Poizat, M., Magnenat-Thalmann, N., Stoddart, A., & Thalmann, D. (2005). Mixing virtual and real scenes in the site of ancient Pompeii. Computer Animation and Virtual Worlds, 16(1), 11–24. https://doi.org/10.1002/cav.53
Petropoulos, J., Kamarianakis, M., Protopsaltis, A., & Papagiannakis, G. (2024). pyGANDALF — an open-source Geometric, ANimation, Directed, Algorithmic, Learning Framework for computer graphics. SIGGRAPH Asia 2024 Educator’s Forum, 1–9. https://doi.org/10.1145/3680533.3697057 (Best Education Paper Award.)
Ponder, M., Herbelin, B., Molet, T., Schertenleib, S., Ulicny, B., Papagiannakis, G., Magnenat-Thalmann, N., & Thalmann, D. (2003). Immersive VR decision training: Telling interactive stories featuring advanced virtual human simulation technologies. Proceedings of the 9th Eurographics Workshop on Virtual Environments, 97–106. The Eurographics Association, ACM Press, Zurich, May 2003.
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386–408. https://doi.org/10.1037/h0042519
Silicon Republic (18 February 2026). Fei-Fei Li’s World Labs raises $1 bn to advance spatial intelligence — backed by Nvidia, AMD, Autodesk, Fidelity, Emerson Collective, Sea. https://www.siliconrepublic.com/start-ups/fei-fei-li-world-labs-raises-1bn-to-spatial-intelligence-ai-world-models-marble
Stephenson, N. (1992). Snow Crash. Bantam Books.
Sutherland, I. E. (1968). A head-mounted three-dimensional display. Proceedings of the AFIPS Fall Joint Computer Conference, 33, 757–764. https://doi.org/10.1145/1476589.1476686
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems (NeurIPS), 30.
Weiss, T., Bailenson, J., Bullock, K., & Greenleaf, W. (2021). Reality, from virtual to augmented. In Digital Health (pp. 275–303). Elsevier. https://doi.org/10.1016/B978-0-12-818914-6.00018-1
Citation notes: the primary references above (McCulloch & Pitts 1943, Turing 1950, Rosenblatt 1958, Minsky & Papert 1969, Sutherland 1968, Stephenson 1992, Hinton, Osindero & Teh 2006, Deng et al. 2009, Krizhevsky et al. 2012, Vaswani et al. 2017, Mildenhall et al. 2020, Kerbl et al. 2023, Weiss et al. 2021, Papagiannakis et al. 2005, Papaefthymiou, Hildenbrand & Papagiannakis 2016, Petropoulos et al. 2024, and Ponder et al. 2003) are verified primary sources. Kolyvakis et al. (2025) refers to a paper accepted at the IEEE International Symposium on Emerging Metaverse co-located with ICCV 2025; the underlying arXiv preprint (Angelis et al., 2024) is verifiable, but the final proceedings citation and author order should be confirmed at re-publication. Kenanidis et al. (2023) is referenced from the author’s own materials and should be verified for exact title and journal before re-publication. The Fei-Fei Li / World Labs material is sourced from contemporary press coverage (IEEE Spectrum, Reuters, Bloomberg, Silicon Republic) and from the World Labs corporate site; quoted formulations reflect Li’s public statements at TED, NeurIPS, and in press interviews 2024–2026.
This essay and the companion timeline article were produced with the assistance of Anthropic Claude, working from primary materials, prior research notes, and editorial direction.
