VIP 17: Enable Unix domain sockets for listen and backend addresses
Allow Unix Domain Sockets (UDS) as listen addresses for Varnish (-a, -T and -M options) and as addresses for backends. Ideally also obtain credentials of the peer process connected on a UDS, such as uid and gid, for use in VCL.
Named listen addresses are not directly related to UDS, but this change would solve some of the problems and mitigate some of the complexity raised by the original draft. Because this change has already been accepted, there is no VIP to link to, and no documentation to refer to until it is implemented. For convenience it is described here.
This feature is similar to how storage backends are exposed in VCL: each has a name that can then be used in VCL, and when a name is omitted, a generic name is assigned (s0, s1, sN etc).
Example: varnishd -s malloc,10G -s video=malloc,100G [...]
You end up with 3 storage backends called s0, video and Transient, and as such have access in VCL to the following symbols and their respective fields:
- storage.s0
- storage.video
- storage.Transient
- (and storage.<name>.*, see man vcl)
You can then have this kind of logic in VCL:
sub vcl_backend_response {
    if (beresp.http.content-type ~ "video") {
        set beresp.storage = storage.video;
    } else {
        set beresp.storage = storage.s0;
    }
}
The advantage of beresp.storage over beresp.storage_hint is the strong typing guaranteeing that VCL won't compile if there is a typo in the storage name.
Named listen addresses will work like storage backends in that regard (generic names being a0, a1, aN etc).
Example: varnishd -a public_http=:80 -a public_https=:8443,PROXY -a admin=:1234 [...]
You can then use the logical names in your VCL:
sub vcl_recv {
    if (local.address == listen_address.public_http) {
        # do an https redirect for example
    }
    if (req.method == "PURGE") {
        if (local.address != listen_address.admin) {
            return (synth(405));
        }
        return (purge);
    }
}
The actual names of the variables used to access this information in VCL haven't been decided yet.
The benefit is the ability to reuse the same VCL even when the varnishd instances in a cluster cannot all provide consistent listen interfaces or port numbers.
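The naming rule described above (explicit names are kept, generic names a0, a1, ... are invented for the rest) can be sketched in a few lines. This is only an illustration in Python, not varnishd code: the function name name_listen_addresses is hypothetical, and the choice that explicitly named entries do not consume a generic index is an assumption of this sketch.

```python
def name_listen_addresses(specs):
    """Map each -a argument to name -> (address, params).

    A name is recognised only when '=' appears before the first comma,
    so parameters such as 'uid=varnish' are not mistaken for a name.
    """
    named = {}
    generic = 0
    for spec in specs:
        head, _, params = spec.partition(",")
        if "=" in head:
            name, _, address = head.partition("=")
        else:
            # No explicit name: invent a0, a1, ... (assumed numbering rule).
            name, address = f"a{generic}", head
            generic += 1
        if name in named:
            raise ValueError(f"duplicate listen address name: {name}")
        named[name] = (address, params)
    return named
```

Keeping the address apart from the parameters also makes it straightforward to render a listen address as a string without its parameters.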
Note: the string conversion described in this section hasn't been discussed yet.
Objects of type listen_address could be used where strings are expected and be converted to the address part of the -a option (that is, excluding the parameters).
Example: varnishd -a public_http=:80 -a public_https=:8443,PROXY -a admin=:1234 [...]
sub vcl_deliver {
    set resp.http.Address = local.address;
}
A non-synthetic response may contain one of the following headers:
Address: :80
Address: :8443
In the case of Unix domain sockets, automatic conversion to a string could be used, for example, for regular expression matching of the paths:
sub vcl_recv {
    if (req.method == "PURGE") {
        # there may be more than one admin UDS
        if (local.address !~ "admin\.sock$") {
            return (synth(405));
        }
        return (purge);
    }
}
This is not a security feature despite what all the examples above may suggest. Using this as a security measure implies the assumption that the network is actually secured before traffic hits Varnish on the admin listen address for example (firewalls and all that jazz).
phk: I don't agree entirely, the root@ may want to restrict the paths to backends.
dridi: I'm not sure I understand, this is not about UDS yet, only named listen addresses in general.
We can expose additional macros for listen addresses. For example with a v1 varnish instance:
- v1_addr: the first listen address
- v1_port: the first listen port
- v1_sock: the first listen address+port
- v1_addr_a0: a0's listen address
- v1_port_a0: a0's listen port
- v1_sock_a0: a0's listen address+port
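As a rough illustration of the naming scheme in the list above, the following Python sketch derives the macro names from a list of named listen addresses. The helper listen_macros is hypothetical, and joining address and port with a space for the sock macros is an assumption of this sketch rather than something the proposal specifies.

```python
def listen_macros(instance, listeners):
    """Build macros for an instance from (name, addr, port) tuples."""
    macros = {}
    for index, (name, addr, port) in enumerate(listeners):
        values = {"addr": addr, "port": str(port), "sock": f"{addr} {port}"}
        for kind, value in values.items():
            if index == 0:
                # Unqualified macros refer to the first listen address.
                macros[f"{instance}_{kind}"] = value
            # Qualified macros carry the listen address name as a suffix.
            macros[f"{instance}_{kind}_{name}"] = value
    return macros
```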
Once again, the benefit is strong typing: port numbers in VCL and on the varnishd command line may get out of sync without being noticed, whereas a typo in a listen address name prevents the VCL from compiling. It's also a transport-independent alternative to ACLs, as shown in the purge example above.
Being transport-independent, it can also accommodate future transports, such as the Unix domain sockets described below.
The main reason to use a UDS is that it works like TCP sockets (reliable bidirectional byte stream behind a file descriptor) and would likely not be too intrusive in the existing code base.
Other noteworthy reasons:
- Eliminate the overhead of TCP/loopback for connections with peers that are co-located on a host with Varnish
- The possibility to query the peer process credentials and restrict access using regular filesystem permissions
A common case for co-locating Varnish with a peer is the need for a TLS proxy for HTTPS. On both client and backend sides, a UDS should work seamlessly with the PROXY protocol.
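To illustrate the claim that a UDS behaves like a TCP socket (a reliable bidirectional byte stream behind a file descriptor), here is a minimal Python sketch that echoes a payload over a socket bound to a filesystem path. It is unrelated to the Varnish code base; the function name and the socket file name are made up for the example.

```python
import os
import socket
import tempfile

def uds_round_trip(payload):
    """Send a small payload over a Unix domain socket and return the echo."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "http.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)          # creates the socket file
            server.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)   # queued in the backlog until accept()
                conn, _ = server.accept()
                with conn:
                    client.sendall(payload)
                    data = conn.recv(len(payload))
                    conn.sendall(data)
                    return client.recv(len(payload))
```

Apart from the address family and the path passed to bind() and connect(), the calls are the same as for a TCP socket.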
On the listen side, expecting an absolute path would prevent ambiguity with IP addresses or ports:
varnishd -a /path/to/http.sock -T /path/to/cli.sock [...]
As is common with other varnishd options, we can pass additional parameters:
varnishd -T /path/to/cli.sock,uid=varnish,gid=varnish
However this introduces an ambiguity for PROXY protocol in the -a option. The syntax can be changed to:
varnishd -a /path/to/http.sock,proto=<proto>,uid=varnish,mode=0600 [...]
The -M option being of the connect persuasion, it wouldn't take additional parameters beyond the absolute path.
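The listen-side notation above can be sketched as a small parser: an address starting with / is treated as a UDS path, anything else as an IP address or port, and every parameter must be a key=value pair. This is a hedged Python illustration, not the varnishd argument parser; parse_listen_arg is a hypothetical name, and the sketch handles neither an optional name= prefix nor paths containing commas.

```python
def parse_listen_arg(arg):
    """Split a listen argument into (kind, address, params)."""
    address, *rest = arg.split(",")
    # An absolute path cannot be confused with an IP address or a port.
    kind = "uds" if address.startswith("/") else "ip"
    params = {}
    for item in rest:
        key, sep, value = item.partition("=")
        if not sep:
            # A bare word such as PROXY is the ambiguity key=value avoids.
            raise ValueError(f"expected key=value, got {item!r}")
        params[key] = value
    return kind, address, params
```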
On the backend side we can avoid ambiguity by introducing a new .path field:
backend local {
    .path = "/path/to/backend.sock";
    # or maybe .unix or .uds instead?
}
The .path field would be enough in itself to declare a backend (like .host) and would be mutually exclusive with .host and .port.
By adding a parameter (for example uds_path), akin to vcl_path and vmod_path, that maintains a search path for sockets, we could allow relative paths on the backend side.
Getting the peer credentials is not portable, and the least common denominator seems to be the euid and egid. We probably want to extract them both as names and numbers. See Geoff's draft for the technical details.
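To make the portability remark concrete, here is a Python sketch of one platform-specific way to read peer credentials: the Linux SO_PEERCRED socket option, which returns a struct ucred (pid, uid, gid). This is an assumption about one possible mechanism, not what the proposal or Geoff's draft prescribes; other systems expose different interfaces, so the sketch returns None where the option is missing. The function name peer_credentials is hypothetical.

```python
import os
import socket
import struct

def peer_credentials(sock):
    """Return (uid, gid) of the peer of a connected UDS, or None."""
    if not hasattr(socket, "SO_PEERCRED"):
        return None  # not available here; other systems need another API
    # struct ucred holds pid_t pid, uid_t uid, gid_t gid.
    size = struct.calcsize("iII")
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    _pid, uid, gid = struct.unpack("iII", raw)
    return uid, gid
```

On Linux the values reported are the credentials that were in effect when the connection was established.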
The backend notation was already described above, but filed under the "notation" category rather than VCL. This section is more about the VCL changes in the context of a transaction.
The obvious implication of a UDS listen address is the lack of values for the *.ip variables (same on the backend side for beresp.backend.ip).
This could be solved by making all uses of VCL_IP gracefully fail in the presence of a NULL IP address. So an ACL match '~' would always fail and a negative match '!~' would always succeed.
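The proposed NULL semantics can be modelled in a few lines of Python, with None standing in for a NULL IP address. This only illustrates the intended truth table; it says nothing about how VCL ACLs are implemented, and the function names are made up.

```python
import ipaddress

def acl_match(ip, acl):
    """Positive match (~): a missing IP (None) never matches."""
    if ip is None:
        return False
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in acl)

def acl_no_match(ip, acl):
    """Negative match (!~): plain negation, so None always succeeds."""
    return not acl_match(ip, acl)
```

One consequence worth keeping in mind: a purge guard written as a negative match against an ACL would reject every request arriving over a UDS.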
What happens when a UDS gets IP addresses from a PROXY header? One solution could be to set server.ip and client.ip accordingly and leave the local.ip and remote.ip variables NULL. It would preserve this pattern:
sub vcl_recv {
    if (local.ip != server.ip) {
        # PROXY protocol detected
        set req.http.X-Forwarded-Proto = "https"; # for instance
    }
}
Much like we may access port numbers via *.ip variables, we want to access credentials of a UDS peer. We can do that using the std VMOD.
In the case of std.port, it could fail gracefully like ACLs when a NULL IP address is submitted by returning -1.
The std VMOD could then learn new functions:
- std.uid
- std.gid
- std.uid_name
- std.gid_name
Example:
import std;
sub vcl_recv {
    std.log("euid: " + std.uid(local.address));
}
If local.address is not a UDS, numeric variants could also return -1 and name variants could return NULL. The functions could also take fallback parameters, possibly with a default value to the ones suggested (-1 and NULL).
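The fallback behaviour suggested above can be summarised with a small Python model in which None plays the role of NULL. The Address class and the std_* functions are hypothetical stand-ins for local.address and the proposed std functions, not an actual VMOD interface.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Address:
    """Hypothetical stand-in for local.address; unused fields stay None."""
    port: Optional[int] = None       # only set for TCP/IP addresses
    uid: Optional[int] = None        # only set for UDS peers
    uid_name: Optional[str] = None   # only set for UDS peers

def std_port(addr, fallback=-1):
    return addr.port if addr.port is not None else fallback

def std_uid(addr, fallback=-1):
    return addr.uid if addr.uid is not None else fallback

def std_uid_name(addr, fallback=None):
    return addr.uid_name if addr.uid_name is not None else fallback
```

With explicit fallback parameters, a caller can pick a value that suits its own logic instead of testing for -1 or NULL.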
The consensus seems to lean towards naming functions by omitting the "effective" e from e[ug]id.
beresp.backend.ip should obviously be NULL in the case of a UDS backend if we follow the rules described above. However, it is already possible to write a backend implementation not based on TCP/IP (see fsbackend for example), and NULL already seems to be the way to go there.
The question here is rather whether we need something like beresp.backend.path in addition to the ip field. The same question applies to peer credentials: they probably don't make sense for backends (and leaving them out would keep the new std functions limited to the listen address type).
For std.uid to provide anything useful, we need a peer that a static listen_address.<name> has no reason to have. To enable strong typing, the == operator should be backed by a VRT function that checks for equivalence except for the peer. The structs behind listen_address.<name> could have a negative file descriptor for the peer for example.
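The idea of an equality check that ignores the peer can be illustrated with a Python dataclass whose peer field is excluded from comparison. The ListenAddress class, its fields and the socket path used below are hypothetical; the actual VRT function and struct layout are not defined by this proposal.

```python
from dataclasses import dataclass, field

@dataclass
class ListenAddress:
    name: str
    address: str
    # Peer file descriptor; -1 marks the static listen_address.<name> object.
    # compare=False keeps it out of the generated == operator.
    peer_fd: int = field(default=-1, compare=False)

static = ListenAddress("admin", "/run/admin.sock")
live = ListenAddress("admin", "/run/admin.sock", peer_fd=7)
same = static == live   # True: equivalent except for the peer
```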
Another possibly useful VRT function would be one that finds the corresponding listen_address.<name> of a local.address, for VMODs looking for a safe pointer outliving a transaction.
1. phk: What happens to struct suckaddr? We added that to avoid lugging around sockaddr_storage all over the place and it shaves something like 4x96 bytes off the size of a session?
   dridi: In the case of a UDS, we can keep track of the sockaddr_un with the rest of the -a parameters and use a pointer to that "pseudo-static" struct in the suckaddr union. That shouldn't increase the overall size.
2. phk: On the VCL side, what happens if in the future a jail performs a chroot? Users would have similar problems with today's std.fileread.
   dridi: That would indeed be a problem for backends.
3. phk: During the first planning session for Varnish 6 we agreed that UDS addresses would be kept separate from suckaddr. (How?)
   dridi: See question 1, then we can figure what to do in code branching on the suckaddr type.
4. dridi: Is the question of naming from the original draft still relevant?
5. phk: What happens if the VCL asks for remote.ip.port()?
   dridi: I'm supposed to answer that in the VCL/VRT section but I haven't yet. I need to browse the planning session logs because I think we agreed that with the lack of IP address, *.ip should be NULL and IP-related facilities (e.g. ACLs, std.port...) should gracefully fail if they encounter NULL.
6. phk: What happens if the VCL asks for remote.ip.uid() on an IPv4/6 socket?
   dridi: Same as question 5, although with subtle differences. In both questions the syntax is wrong anyway.
7. dridi: the section on beresp.backend.ip needs further discussion too.