Skip to content

Conversation

bsmithai
Copy link

Problem:

create_link_remap logic specifies before creating the hard link that these file names can be truly unique. Currently these files are generated w/ ids matching sequentially with file descriptor ids. This change introduces generating a random suffix to prevent link_remap collisions when a dump fails and another dump is attempted on the same process that has open fds with links that were remapped.

Tested:

Tested w/ a process that opens and uses semaphores, the test creates remap files in /dev/shm that would have been created by criu on dump. A dump is done on this process and collisions are avoided. A restore is then done to verify nothing was damaged in C/R.

Very simple solution but I wanted to verify e2e.

Results:

Original bug:

(00.022698) Strip ' (deleted)' tag from './dev/shm/sem.8NKp2J (deleted)'
(00.022703) Warn  (criu/files-reg.c:1881): Can't link  -> ./dev/shm/link_remap.5
(00.022707) Error (criu/files-reg.c:1159): Can't link remap to /dev/shm/sem.8NKp2J: File exists
(00.022711) Error (criu/cr-dump.c:1570): Collect mappings (pid: 1401204) failed with -1
(00.022753) net: Unlock network

This branch:

(00.043028) Found regular file mapping, OK
(00.043041) Dumping path for -3 fd via self 12 [/dev/shm/sem.8NKp2J (deleted)]
(00.043046) Strip ' (deleted)' tag from './dev/shm/sem.8NKp2J (deleted)'
(00.043081) Only file size could be stored for validation for file /dev/shm/sem.8NKp2J
(00.043087) Handling VMA with the following smaps entry: 745051de5000-745051de7000 r--p 00037000 fc:01 17565282                   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
(00.043097) Found regular file mapping, OK
(00.043120) Handling VMA with the following smaps entry: 745051de7000-745051de9000 rw-p 00039000 fc:01 17565282                   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
(00.043130) vma 745051de7000 borrows vfi from previous 745051de5000
(00.043134) Handling VMA with the following smaps entry: 7ffcde225000-7ffcde247000 rw-p 00000000 00:00 0                          [stack]
(00.043143) Handling VMA with the following smaps entry: ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]
(00.043150) Collected, longest area occupies 405 pages
(00.043153) 0x64e115337000-0x64e115338000 (4K) prot 0x1 flags 0x2 fdflags 0 st 0x41 off 0 reg fp  shmid: 0x1
(00.043156) 0x64e115338000-0x64e115339000 (4K) prot 0x5 flags 0x2 fdflags 0 st 0x41 off 0x1000 reg fp  shmid: 0x1
(00.043158) 0x64e115339000-0x64e11533a000 (4K) prot 0x1 flags 0x2 fdflags 0 st 0x41 off 0x2000 reg fp  shmid: 0x1
Original CRIU (/usr/local/bin/criu): Exit code 1
Modified CRIU (./criu/criu):         Exit code 0
Modified CRIU Restore:               Exit code 0

…que, name is now unique

Signed-off-by: Brandon Smith <brandon.smith@cedana.ai>
@adrianreber
Copy link
Member

Do you think you could also add a zdtm test? You mentioned that you have a test that is able to trigger the problem you described.

@avagin avagin self-assigned this Jun 17, 2025

mntns_root = mntns_get_root_fd(nsid);

fd = open("/dev/urandom", O_RDONLY);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we generate this uid once for the entire dump?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces generating a random suffix to prevent link_remap collisions when a dump fails and another dump is attempted on the same process

should we generate this uid once for the entire dump?

We already do this for dump_criu_run_id. Can we use this value instead of generating a new one?

}

// create file link_remap.fd_id.random_suffix id
snprintf(tmp + 1, sizeof(link_name) - (size_t)(tmp - link_name) - 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we customize a link name if linkat_hard returns EEXIST?

unsigned char rand_bytes[8];
if (read(fd, rand_bytes, sizeof(rand_bytes)) == sizeof(rand_bytes)) {
snprintf(random_suffix, sizeof(random_suffix), "%02x%02x%02x%02x",
rand_bytes[0], rand_bytes[1], rand_bytes[2], rand_bytes[3]);
Copy link
Member

@avagin avagin Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you read 8 bytes, if you use just four of them?

(unsigned long)(ts.tv_sec ^ ts.tv_nsec));
}

// create file link_remap.fd_id.random_suffix id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use tabs for indentation

}

/* Use grand parent, if parent directory does not exist. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

white spaces...

@rst0git
Copy link
Member

rst0git commented Jun 17, 2025

Tested w/ a process that opens and uses semaphores, the test creates remap files in /dev/shm that would have been created by criu on dump. A dump is done on this process and collisions are avoided. A restore is then done to verify nothing was damaged in C/R.

Would you be able to include this test, for example, as part of ZDTM or test/others?

@bsmithai
Copy link
Author

bsmithai commented Jun 17, 2025 via email

Copy link

A friendly reminder that this PR had no activity for 30 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants