I will try to follow the LXC code to see how a container is actually run. This is more of a personal exercise to understand how capabilities, namespaces, AppArmor and SELinux all fit together.
I will follow the code at: https://github.com/lxc/lxc
I’m interested first just in the lxc-start command. It starts in src/lxc/lxc_start.c:100, and the important things it calls are:
lxc_caps_init, defined in src/lxc/caps.c:139: this just checks for the correct permissions so that caps can be set later.
lxc_conf_init, defined in src/lxc/conf.c:1737: this initializes the conf structure and some lists for later use in cgroup, network, mount_list and caps.
lxc_start, defined in src/lxc/start.c:784: this calls __lxc_start, which does almost all the work.
This function calls lxc_init, which checks for the correct privileges for caps, allocates memory for the lxc_handler and initializes AppArmor (I will leave all the AppArmor parts for later). It also reads some configuration for seccomp, which seems really nice, especially mode 2, but all this will be analyzed later.
A call to lxc_set_state changes the state to STARTING, then the “pre-start” hook is called. Then the internal ttys are created according to the config file using lxc_create_tty, as ptys on the parent side (no fork yet); they seem to be closed after the fork, with just the name kept for the child? After the ttys, lxc_create_console is called and the console is created. This can be a simple log file or the current console; either way it is also handled with a pty.
We are now getting to our own fork. First a signal fd is set up to handle signals in case the child dies early, before its own handlers are set up; since the child inherits this fd, it will be able to listen on it too.
lxc_init finally returns the handler, with pointers to the data created previously.
We now drop the CAP_SYS_BOOT capability, so that the process is not able to reboot the host machine.
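As far as I understand, dropping a capability like this goes through the capability bounding set and prctl. A small sketch, assuming that mechanism; the function names boot_cap_in_bounding_set and drop_boot_cap are mine, not LXC's:

```c
#include <errno.h>
#include <linux/capability.h>
#include <sys/prctl.h>

/* Hypothetical helper: returns 1 if CAP_SYS_BOOT is still in the
 * bounding set of the calling thread, 0 if it is not, -1 on error. */
static int boot_cap_in_bounding_set(void)
{
    return prctl(PR_CAPBSET_READ, CAP_SYS_BOOT, 0, 0, 0);
}

/* Hypothetical helper: remove CAP_SYS_BOOT from the bounding set.
 * Returns 0 on success, or the errno value on failure (EPERM when the
 * caller does not hold CAP_SETPCAP). */
static int drop_boot_cap(void)
{
    if (prctl(PR_CAPBSET_DROP, CAP_SYS_BOOT, 0, 0, 0) < 0)
        return errno;
    return 0;
}
```

The drop is one way only: once the capability is out of the bounding set, nothing started from this process can get it back through execve, which is exactly what we want for a container that must not reboot the host. Run as a normal user, drop_boot_cap should just fail with EPERM.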
Mmm, strace is showing me a clone around here but I can’t see it in the code. It basically reads the mounts, so it seems to be the function get_cgroup_mount in src/lxc/cgroup.c. What is for sure is that do_start in start.c is basically the handler passed to the clone syscall, so the clone should be somewhere around this place.
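To make the "handler in the clone syscall" idea concrete, here is a minimal sketch of calling clone with a callback through the glibc wrapper. It is not the LXC code: child_entry is a made-up stand-in for do_start, run_in_clone is a hypothetical wrapper, and the stack size is an arbitrary choice of mine.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (256 * 1024)

/* Hypothetical stand-in for do_start: it runs in the child, and its
 * return value becomes the child's exit status. */
static int child_entry(void *arg)
{
    return *(int *)arg;
}

/* clone() a child that runs child_entry(). `ns_flags` may carry
 * CLONE_NEW* bits such as CLONE_NEWPID | CLONE_NEWNS, which need
 * privileges; 0 gives a plain fork-like child. Returns the child's
 * exit status, or -1 on error. */
static int run_in_clone(int ns_flags, int code)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    int status, ret = -1;
    pid_t pid;

    if (!stack)
        return -1;

    /* The stack grows down on the usual architectures, so clone()
     * gets a pointer to the top of the buffer. */
    pid = clone(child_entry, stack + CHILD_STACK_SIZE,
                ns_flags | SIGCHLD, &code);
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        ret = WEXITSTATUS(status);

    free(stack);
    return ret;
}
```

With ns_flags set to 0 this behaves like a fork that jumps straight into the callback, so it can be tried as a normal user. The namespace flags are where the container part comes in, and passing them without enough privileges should make clone fail, in which case run_in_clone returns -1.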