Add EgoExo Forge and VistaDream examples #11883
Conversation
Hi! Thanks for opening this pull request.
Because this is your first time contributing to this repository, make sure you've read our Contributor Guide and Code of Conduct.
> This is an external example. Check the [repository](https://github.com/rerun-io/vistadream) for more information.
>
> **Requires: Linux** with **NVIDIA GPU** (tested with CUDA 12.9)
```diff
- **Requires: Linux** with **NVIDIA GPU** (tested with CUDA 12.9)
+ **Requires**: Linux with an NVIDIA GPU (tested with CUDA 12.9)
```
> @@ -0,0 +1,35 @@
> <!--[metadata]
> title = "EgoExo Forge"
| title = "EgoExo Forge" | |
| title = "EgoExo Forge" <!-- NOLINT --> |
I think ignoring the lint here is fine, as it's the name of the actual project.
I hope the lint can be skipped like this.
> https://vimeo.com/1134260310?autoplay=1&loop=1&autopause=0&background=1&muted=1&ratio=2386:1634
>
> A comprehensive collection of datasets and tools for egocentric and exocentric human activity understanding, featuring hand-object interactions, manipulation tasks, and multi-view recordings.
```diff
- A comprehensive collection of datasets and tools for egocentric and exocentric human activity understanding, featuring hand-object interactions, manipulation tasks, and multi-view recordings.
+ A collection of datasets and tools for egocentric and exocentric human activity understanding, featuring hand-object interactions, manipulation tasks, and multi-view recordings.
```
> ## Background
>
> EgoExo Forge provides a consistent labeling scheme and data layout for multiple different egocentric and exocentric human datasets, that have different sensor configurations and annotations.
```diff
- EgoExo Forge provides a consistent labeling scheme and data layout for multiple different egocentric and exocentric human datasets, that have different sensor configurations and annotations.
+ EgoExo Forge provides a consistent labeling scheme and data layout across multiple egocentric and exocentric human datasets with varying sensor configurations and annotations.
```
I think this is a bit easier to read.
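For readers wondering what a consistent labeling scheme and data layout buys in practice, here is a minimal hypothetical sketch of logging to shared Rerun entity paths. The paths and stand-in data below are made up for illustration, not the project's actual schema:

```python
import numpy as np
import rerun as rr

rr.init("egoexo_forge_layout_sketch", spawn=True)

# Regardless of which source dataset a sequence comes from, hands and
# cameras are logged under the same entity paths, so one viewer layout
# works for all of them. Paths and data here are illustrative only.
left_hand_keypoints = np.random.rand(21, 3)  # stand-in 21-joint hand pose
rr.log("world/hands/left", rr.Points3D(left_hand_keypoints))
rr.log("world/cameras/ego/image", rr.Image(np.zeros((480, 640, 3), dtype=np.uint8)))
```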
> * [Assembly101](https://assembly-101.github.io/) from Meta. A procedural activity dataset with 4321 multi-view videos of people assembling and disassembling 101 take-apart toy vehicles, featuring rich variations in action ordering, mistakes, and corrections.
> * [HO-Cap](https://irvlutd.github.io/HOCap/) from Nvidia and the University of Texas at Dallas. A dataset for 3D reconstruction and pose tracking of hands and objects in videos, featuring humans interacting with objects for various tasks including pick-and-place actions and handovers.
> * [EgoDex](https://arxiv.org/abs/2505.11709) from Apple. The largest and most diverse dataset of dexterous human manipulation with 829 hours of egocentric video and paired 3D hand tracking, covering 194 different tabletop tasks with everyday household objects.
```diff
- * [Assembly101](https://assembly-101.github.io/) from Meta. A procedural activity dataset with 4321 multi-view videos of people assembling and disassembling 101 take-apart toy vehicles, featuring rich variations in action ordering, mistakes, and corrections.
- * [HO-Cap](https://irvlutd.github.io/HOCap/) from Nvidia and the University of Texas at Dallas. A dataset for 3D reconstruction and pose tracking of hands and objects in videos, featuring humans interacting with objects for various tasks including pick-and-place actions and handovers.
- * [EgoDex](https://arxiv.org/abs/2505.11709) from Apple. The largest and most diverse dataset of dexterous human manipulation with 829 hours of egocentric video and paired 3D hand tracking, covering 194 different tabletop tasks with everyday household objects.
+ * [Assembly101](https://assembly-101.github.io/): A procedural activity dataset with 4321 multi-view videos of people assembling and disassembling 101 take-apart toy vehicles, featuring rich variations in action ordering, mistakes, and corrections.
+ * [HO-Cap](https://irvlutd.github.io/HOCap/): A dataset for 3D reconstruction and pose tracking of hands and objects in videos, featuring humans interacting with objects for various tasks including pick-and-place actions and handovers.
+ * [EgoDex](https://arxiv.org/abs/2505.11709): The largest and most diverse dataset of dexterous human manipulation with 829 hours of egocentric video and paired 3D hand tracking, covering 194 different tabletop tasks with everyday household objects.
```
Do we need to mention the authors in the same line when we already link to the page? And I think a colon before the description is a good idea.
> Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
>
> ```sh
> git clone https://github.com/rerun-io/egoexo-forge.git
> cd egoexo-forge
> pixi run app
> ```
>
> You can try the example on a HuggingFace space [here](https://pablovela5620-egoexo-forge-viewer.hf.space/).
````diff
- Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
- ```sh
- git clone https://github.com/rerun-io/egoexo-forge.git
- cd egoexo-forge
- pixi run app
- ```
- You can try the example on a HuggingFace space [here](https://pablovela5620-egoexo-forge-viewer.hf.space/).
+ You can try the example on a HuggingFace space [here](https://pablovela5620-egoexo-forge-viewer.hf.space/).
+ Or locally, make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
+ ```sh
+ git clone https://github.com/rerun-io/egoexo-forge.git
+ cd egoexo-forge
+ pixi run app
````
I think mentioning the HF space first makes sense, because then readers can immediately click and try it out; at the end of the section it's a bit hidden.
> VistaDream addresses the challenge of 3D scene reconstruction from a single image through a novel two-stage pipeline:
>
> 1. **Coarse 3D Scaffold Construction**: Creates a global scene structure by outpainting image boundaries and estimating depth maps.
> 2. **Multi-view Consistency Sampling (MCS)**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.
```diff
- 2. **Multi-view Consistency Sampling (MCS)**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.
+ 2. **Multi-view Consistency Sampling**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.
```
MCS is never referenced again, so no need to introduce it.
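Since the two stages are only described in prose, a toy, runnable sketch of their shape might help readers skim the structure. Every operation below is a numpy stand-in (padding for outpainting, a constant plane for depth estimation, averaging for the consistency constraint), not VistaDream's actual code:

```python
import numpy as np

def build_coarse_scaffold(image: np.ndarray) -> np.ndarray:
    """Stage 1 stand-in: 'outpaint' the borders (zero-padding here) and
    attach an 'estimated' depth map (a constant plane here) -> RGB-D."""
    outpainted = np.pad(image, ((16, 16), (16, 16), (0, 0)))
    depth = np.ones(outpainted.shape[:2] + (1,))
    return np.concatenate([outpainted, depth], axis=-1)

def multi_view_consistency_sampling(rgbd: np.ndarray, steps: int = 5) -> np.ndarray:
    """Stage 2 stand-in: iteratively refine several noisy 'views' while
    pulling them toward a shared consensus (the consistency constraint)."""
    views = [rgbd + np.random.normal(0.0, 0.01, rgbd.shape) for _ in range(4)]
    for _ in range(steps):
        consensus = np.mean(views, axis=0)
        views = [0.5 * v + 0.5 * consensus for v in views]  # inpainting step stand-in
    return np.mean(views, axis=0)

scene = build_coarse_scaffold(np.zeros((64, 64, 3)))
refined = multi_view_consistency_sampling(scene)
print(refined.shape)  # (96, 96, 4): padded RGB plus a depth channel
```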
> **Requires: Linux** with **NVIDIA GPU** (tested with CUDA 12.9)
>
> Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
Nit: missing colon
```diff
- Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
+ Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run:
```
> 1. **Coarse 3D Scaffold Construction**: Creates a global scene structure by outpainting image boundaries and estimating depth maps.
> 2. **Multi-view Consistency Sampling (MCS)**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.
>
> The framework integrates multiple state-of-the-art models:
I wouldn't count Rerun as a model, although it's state of the art 🤓 How about just:
```diff
- The framework integrates multiple state-of-the-art models:
+ The framework utilizes:
```
Actually this list carries the same information as the first paragraph of this README. Do we need both?
What
This adds the two external examples EgoExo Forge and VistaDream.
egoexo_forge.mov
vistadream.mov
Checklist
To run all checks from `main`, comment on the PR with `@rerun-bot full-check`.