Franka 6-DoF Grasping with GraspNet

This repository demonstrates 6-DoF grasping on the Franka FR3 using GraspNet. It integrates modern PyTorch, stereo depth estimation, and lightweight robot control for reproducible and extensible research.

Key Features

  • Easy to use: Pure Python implementation, no ROS required. Robot control via franky.
  • Modern PyTorch: Upgraded to PyTorch 2.x with CUDA 12.x.
    • Includes a modified knn-cuda adapted for CUDA 12.x.
  • Stereo depth estimation: Uses the D435i's IR stereo pair with FoundationStereo for better depth quality than the native RGB-D stream (see the depth-from-disparity sketch after this list).
  • Multiple demos: Built on top of GraspNet-Baseline, with major modifications in franka_graspnet/ and scripts/.
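
FoundationStereo predicts a disparity map from the two IR images; metric depth then follows from the standard pinhole relation depth = f * B / d. A minimal sketch of that conversion, with illustrative parameter names (the actual wrapper in franka_graspnet/ may expose this differently):

import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    # depth = f * B / d: focal length in pixels, stereo baseline in meters,
    # disparity in pixels. Non-positive disparities are treated as invalid.
    depth = np.zeros_like(disparity, dtype=np.float32)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Example call; use the intrinsics and IR baseline reported by your own D435i.
# depth = disparity_to_depth(disp, focal_px=640.0, baseline_m=0.05)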

Tested Environment

  • OS: Ubuntu 22.04
  • CUDA: 12.4
  • Python: 3.12
  • Torch: 2.6.0
  • Camera: Intel RealSense D435i
  • Robot control: franky (libfranka 0.15.0)

Installation

Create environment

conda create -n graspnet python=3.12
conda activate graspnet

pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
pip install setuptools==78.0.1

# libfranka 0.15
pip install franky-control
pip install pyrealsense2

# install graspnet dependencies
cd pointnet2
python setup.py install
cd ..

cd knn
python setup.py install
cd ..

cd graspnetAPI
pip install .
cd ..

# (Optional) install FoundationStereo dependencies
pip install -r requirements_fs.txt

Download checkpoints

Organize checkpoints as follows:

checkpoints/
├── foundation_stereo
│   ├── 11-33-40
│   │   ├── cfg.yaml
│   │   └── model_best_bp2.pth
│   └── 23-51-11
│       ├── cfg.yaml
│       └── model_best_bp2.pth
└── graspnet
    ├── checkpoint-kn.tar
    └── checkpoint-rs.tar

Usage

Note: We adopt an eye-in-hand setup. You can modify the calibration parameters in franka_controller to adapt them to your own system.
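
In an eye-in-hand setup the calibration reduces to a fixed camera-to-end-effector transform, which is composed with the robot's current pose to express camera-frame grasps in the base frame. A minimal sketch of that composition, with placeholder values and names (the real parameters live in franka_controller):

import numpy as np

# Placeholder camera-to-end-effector extrinsic from your own hand-eye
# calibration (4x4 homogeneous transform); substitute your calibrated values.
T_EE_CAM = np.eye(4)

def grasp_to_base(T_BASE_EE, T_CAM_GRASP):
    # T_BASE_EE:   current end-effector pose reported by the robot (4x4)
    # T_CAM_GRASP: grasp pose predicted by GraspNet in the camera frame (4x4)
    return T_BASE_EE @ T_EE_CAM @ T_CAM_GRASP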

Test GraspNet

python scripts/graspnet_demo.py


Test FoundationStereo

python scripts/fs_demo.py


RealSense + GraspNet (6-DoF grasp prediction, press q to fetch next frame)

python scripts/graspnet_rs_demo.py


RealSense + FoundationStereo (stereo depth estimation, press space to refresh, q to quit)

python scripts/fs_rs_demo.py


RealSense + FoundationStereo + GraspNet (6-DoF grasp prediction, press q to fetch next frame)

python scripts/graspnet_fs_rs_demo.py


Franka + RealSense RGB-D + GraspNet (real-time grasping)

python scripts/franky_graspnet_rs_demo.py


Franka + RealSense IR + FoundationStereo + GraspNet (real-time grasping)

python scripts/franky_graspnet_fs_demo.py


Known Issues

When controlling FR3 with franky, motion may sometimes abort with reflex errors such as:

franky._franky.ControlException: libfranka: Move command aborted: motion aborted by reflex! ["joint_velocity_violation"]

franky._franky.ControlException: libfranka: Move command aborted: motion aborted by reflex! ["cartesian_reflex"]

Currently, the best set of control parameters is unclear. Trajectory planners (e.g., curobo) might help. Contributions, suggestions, and parameter tuning advice are highly welcome! 🚀
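
In practice the simplest mitigation is to run the robot more slowly. Below is a hedged sketch using franky's global dynamics scaling, following the upstream franky examples (verify the class and attribute names against the franky version you have installed):

from franky import Robot, JointMotion

robot = Robot("172.16.0.2")            # replace with your FR3's IP address
robot.recover_from_errors()            # clear a previous reflex stop
robot.relative_dynamics_factor = 0.05  # scale velocity/acceleration/jerk down

# Slow joint-space motions are far less likely to trip velocity reflexes.
robot.move(JointMotion([0.0, -0.78, 0.0, -2.36, 0.0, 1.57, 0.78]))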


GraspNet Baseline

Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020).

[paper] [dataset] [API] [doc]


Top 50 grasps detected by our baseline model.


Requirements

  • Python 3
  • PyTorch 1.6
  • Open3d >=0.8
  • TensorBoard 2.3
  • NumPy
  • SciPy
  • Pillow
  • tqdm

Installation

Get the code.

git clone https://github.com/graspnet/graspnet-baseline.git
cd graspnet-baseline

Install packages via Pip.

pip install -r requirements.txt

Compile and install pointnet2 operators (code adapted from votenet).

cd pointnet2
python setup.py install

Compile and install knn operator (code adapted from pytorch_knn_cuda).

cd knn
python setup.py install

Install graspnetAPI for evaluation.

git clone https://github.com/graspnet/graspnetAPI.git
cd graspnetAPI
pip install .

Tolerance Label Generation

Tolerance labels are not included in the original dataset and require additional generation. Make sure you have downloaded the original dataset from GraspNet. The generation code is in dataset/generate_tolerance_label.py. You can generate the tolerance labels by running the script (--dataset_root and --num_workers should be specified according to your settings):

cd dataset
sh command_generate_tolerance_label.sh
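
Equivalently, the generation script can be invoked directly with the two flags mentioned above (path and worker count are placeholders):

python generate_tolerance_label.py --dataset_root /path/to/graspnet --num_workers 16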

Or you can download the tolerance labels from Google Drive/Baidu Pan and run:

mv tolerance.tar dataset/
cd dataset
tar -xvf tolerance.tar

Training and Testing

Training examples are shown in command_train.sh. --dataset_root, --camera and --log_dir should be specified according to your settings. You can use TensorBoard to visualize the training process.

Testing examples are shown in command_test.sh, which covers both inference and result evaluation. --dataset_root, --camera, --checkpoint_path and --dump_dir should be specified according to your settings. Set --collision_thresh to -1 for fast inference.
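
For reference, here are hedged example invocations with the flags named above, assuming the usual train.py / test.py entry points wrapped by the shell scripts (paths are placeholders):

python train.py --dataset_root /path/to/graspnet --camera realsense --log_dir logs/log_rs
python test.py --dataset_root /path/to/graspnet --camera realsense --checkpoint_path logs/log_rs/checkpoint.tar --dump_dir logs/dump_rs --collision_thresh -1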

The pretrained weights can be downloaded from:

checkpoint-rs.tar and checkpoint-kn.tar are trained on RealSense data and Kinect data, respectively.

Demo

A demo program is provided for grasp detection and visualization on RGB-D images. Refer to command_demo.sh to run it. --checkpoint_path should be specified according to your settings (make sure you have downloaded the pretrained weights; we recommend the RealSense model, since it may transfer better). The output should look similar to the example image in the upstream repository.

Try your own data by modifying get_and_process_data() in demo.py. Refer to doc/example_data/ for data preparation. RGB-D images and camera intrinsics are required for inference; factor_depth is the scale factor that converts raw depth values into meters. You can also add a workspace mask for denser output.
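
For intuition, back-projecting a depth image into a metric point cloud only needs the intrinsics and factor_depth. demo.py uses its own helpers for this step, so the sketch below is just the underlying math with illustrative names:

import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy, factor_depth):
    # depth: raw depth image (e.g. uint16 in millimeters)
    # fx, fy, cx, cy: pinhole intrinsics in pixels
    # factor_depth: scale dividing raw depth into meters (e.g. 1000 for mm)
    z = depth.astype(np.float32) / factor_depth
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # organized (H, W, 3) cloud in meters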

Results

Results "In repo" report the model performance with single-view collision detection as post-processing. In evaluation we set --collision_thresh to 0.01.

Evaluation results on RealSense camera:

            |  Seen                  |  Similar               |  Novel
            |  AP     AP0.8   AP0.4  |  AP     AP0.8   AP0.4  |  AP     AP0.8   AP0.4
In paper    |  27.56  33.43   16.95  |  26.11  34.18   14.23  |  10.55  11.25    3.98
In repo     |  47.47  55.90   41.33  |  42.27  51.01   35.40  |  16.61  20.84    8.30

Evaluation results on Kinect camera:

            |  Seen                  |  Similar               |  Novel
            |  AP     AP0.8   AP0.4  |  AP     AP0.8   AP0.4  |  AP     AP0.8   AP0.4
In paper    |  29.88  36.19   19.31  |  27.84  33.19   16.62  |  11.51  12.92    3.56
In repo     |  42.02  49.91   35.34  |  37.35  44.82   30.40  |  12.17  15.17    5.51

Citation

Please cite our paper in your publications if it helps your research:

@article{fang2023robust,
  title={Robust grasping across diverse sensor qualities: The GraspNet-1Billion dataset},
  author={Fang, Hao-Shu and Gou, Minghao and Wang, Chenxi and Lu, Cewu},
  journal={The International Journal of Robotics Research},
  year={2023},
  publisher={SAGE Publications Sage UK: London, England}
}

@inproceedings{fang2020graspnet,
  title={GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping},
  author={Fang, Hao-Shu and Wang, Chenxi and Gou, Minghao and Lu, Cewu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={11444--11453},
  year={2020}
}

License

All data, labels, code and models belong to the GraspNet team, MVIG, SJTU. They are freely available for non-commercial use and may be redistributed under these conditions. For commercial queries, please drop an email at fhaoshu at gmail_dot_com and cc lucewu at sjtu.edu.cn.
