
Conversation

@danielduberg

What

This adds two external examples: EgoExo Forge and VistaDream.

egoexo_forge.mov
vistadream.mov

Checklist

  • I have read and agree to the Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG

To run all checks from main, comment on the PR with @rerun-bot full-check.

@danielduberg added the examples (Issues relating to the Rerun examples) label Nov 13, 2025

@github-actions bot left a comment


Hi! Thanks for opening this pull request.

Because this is your first time contributing to this repository, make sure you've read our Contributor Guide and Code of Conduct.


This is an external example. Check the [repository](https://github.com/rerun-io/vistadream) for more information.

**Requires: Linux** with **NVIDIA GPU** (tested with CUDA 12.9)

Suggested change
**Requires: Linux** with **NVIDIA GPU** (tested with CUDA 12.9)
**Requires**: Linux with an NVIDIA GPU (tested with CUDA 12.9)

@@ -0,0 +1,35 @@
<!--[metadata]
title = "EgoExo Forge"

Suggested change
title = "EgoExo Forge"
title = "EgoExo Forge" <!-- NOLINT -->

I think ignoring the lint here is fine, as it's the name of the actual project.

I hope the lint can be skipped like this.


https://vimeo.com/1134260310?autoplay=1&loop=1&autopause=0&background=1&muted=1&ratio=2386:1634

A comprehensive collection of datasets and tools for egocentric and exocentric human activity understanding, featuring hand-object interactions, manipulation tasks, and multi-view recordings.

Suggested change
A comprehensive collection of datasets and tools for egocentric and exocentric human activity understanding, featuring hand-object interactions, manipulation tasks, and multi-view recordings.
A collection of datasets and tools for egocentric and exocentric human activity understanding, featuring hand-object interactions, manipulation tasks, and multi-view recordings.


## Background

EgoExo Forge provides a consistent labeling scheme and data layout for multiple different egocentric and exocentric human datasets, that have different sensor configurations and annotations.

Suggested change
EgoExo Forge provides a consistent labeling scheme and data layout for multiple different egocentric and exocentric human datasets, that have different sensor configurations and annotations.
EgoExo Forge provides a consistent labeling scheme and data layout across multiple egocentric and exocentric human datasets with varying sensor configurations and annotations.

I think this is a bit easier to read.

Comment on lines +19 to +21
* [Assembly101](https://assembly-101.github.io/) from Meta. A procedural activity dataset with 4321 multi-view videos of people assembling and disassembling 101 take-apart toy vehicles, featuring rich variations in action ordering, mistakes, and corrections.
* [HO-Cap](https://irvlutd.github.io/HOCap/) from Nvidia and the University of Texas at Dallas. A dataset for 3D reconstruction and pose tracking of hands and objects in videos, featuring humans interacting with objects for various tasks including pick-and-place actions and handovers.
* [EgoDex](https://arxiv.org/abs/2505.11709) from Apple. The largest and most diverse dataset of dexterous human manipulation with 829 hours of egocentric video and paired 3D hand tracking, covering 194 different tabletop tasks with everyday household objects.

Suggested change
* [Assembly101](https://assembly-101.github.io/) from Meta. A procedural activity dataset with 4321 multi-view videos of people assembling and disassembling 101 take-apart toy vehicles, featuring rich variations in action ordering, mistakes, and corrections.
* [HO-Cap](https://irvlutd.github.io/HOCap/) from Nvidia and the University of Texas at Dallas. A dataset for 3D reconstruction and pose tracking of hands and objects in videos, featuring humans interacting with objects for various tasks including pick-and-place actions and handovers.
* [EgoDex](https://arxiv.org/abs/2505.11709) from Apple. The largest and most diverse dataset of dexterous human manipulation with 829 hours of egocentric video and paired 3D hand tracking, covering 194 different tabletop tasks with everyday household objects.
* [Assembly101](https://assembly-101.github.io/): A procedural activity dataset with 4321 multi-view videos of people assembling and disassembling 101 take-apart toy vehicles, featuring rich variations in action ordering, mistakes, and corrections.
* [HO-Cap](https://irvlutd.github.io/HOCap/): A dataset for 3D reconstruction and pose tracking of hands and objects in videos, featuring humans interacting with objects for various tasks including pick-and-place actions and handovers.
* [EgoDex](https://arxiv.org/abs/2505.11709): The largest and most diverse dataset of dexterous human manipulation with 829 hours of egocentric video and paired 3D hand tracking, covering 194 different tabletop tasks with everyday household objects.

Do we need to mention the authors on the same line when we already link to the page? And I think a colon before the description is a good idea.

Comment on lines +27 to +35
Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run

```sh
git clone https://github.com/rerun-io/egoexo-forge.git
cd egoexo-forge
pixi run app
```

You can try the example on a HuggingFace space [here](https://pablovela5620-egoexo-forge-viewer.hf.space/).

Suggested change
Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
```sh
git clone https://github.com/rerun-io/egoexo-forge.git
cd egoexo-forge
pixi run app
```
You can try the example on a HuggingFace space [here](https://pablovela5620-egoexo-forge-viewer.hf.space/).
You can try the example on a HuggingFace space [here](https://pablovela5620-egoexo-forge-viewer.hf.space/).
Or locally, make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
```sh
git clone https://github.com/rerun-io/egoexo-forge.git
cd egoexo-forge
pixi run app
```

I think mentioning the HF space first makes sense, because then readers can immediately click and try it out. At the end it's a bit hidden.

VistaDream addresses the challenge of 3D scene reconstruction from a single image through a novel two-stage pipeline:

1. **Coarse 3D Scaffold Construction**: Creates a global scene structure by outpainting image boundaries and estimating depth maps.
2. **Multi-view Consistency Sampling (MCS)**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.

Suggested change
2. **Multi-view Consistency Sampling (MCS)**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.
2. **Multi-view Consistency Sampling**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.

MCS is never referenced again, so no need to introduce it.


**Requires: Linux** with **NVIDIA GPU** (tested with CUDA 12.9)

Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run

Nit: missing colon

Suggested change
Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run
Make sure you have the [Pixi package manager](https://pixi.sh/latest/#installation) installed and run:
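For reference, the local run presumably mirrors the EgoExo Forge example above; the exact Pixi task name below is an assumption, so check the [repository](https://github.com/rerun-io/vistadream) for the actual command:

```sh
# Assumed commands, mirroring the egoexo-forge example; the task name may differ.
git clone https://github.com/rerun-io/vistadream.git
cd vistadream
pixi run app
```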

1. **Coarse 3D Scaffold Construction**: Creates a global scene structure by outpainting image boundaries and estimating depth maps.
2. **Multi-view Consistency Sampling (MCS)**: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.

The framework integrates multiple state-of-the-art models:

I wouldn't count Rerun as a model, although it's state of the art 🤓 How about just:

Suggested change
The framework integrates multiple state-of-the-art models:
The framework utilizes:


@MichaelGrupp Nov 13, 2025


Actually this list carries the same information as the first paragraph of this README. Do we need both?

Labels: examples (Issues relating to the Rerun examples), include in changelog
