Commit 1c25676

docs: document available security measures (#2270)

Several security measures can be used to mitigate risk when processing potentially malicious input. This change adds documentation about available security measures and examples and tests that illustrate their usage.
1 parent 60d98db commit 1c25676

File tree

13 files changed

+719
-15
lines changed

docs/apidocs/examples.rst

Lines changed: 16 additions & 0 deletions
@@ -115,3 +115,19 @@ These examples all live in ``./examples`` in the source-distribution of RDFLib.
115115
:undoc-members:
116116
:show-inheritance:
117117

118+
:mod:`~examples.secure_with_audit` Module
119+
-----------------------------------------
120+
121+
.. automodule:: examples.secure_with_audit
122+
:members:
123+
:undoc-members:
124+
:show-inheritance:
125+
126+
127+
:mod:`~examples.secure_with_urlopen` Module
128+
-------------------------------------------
129+
130+
.. automodule:: examples.secure_with_urlopen
131+
:members:
132+
:undoc-members:
133+
:show-inheritance:

docs/index.rst

Lines changed: 13 additions & 0 deletions
@@ -26,6 +26,18 @@ RDFLib is a pure Python package for working with `RDF <http://www.w3.org/RDF/>`_
2626

2727
* both Queries and Updates are supported
2828

29+
.. caution::
30+
31+
RDFLib is designed to access arbitrary network and file resources. In some
32+
cases these are directly requested resources; in other cases they are
33+
indirectly referenced resources.
34+
35+
If you are using RDFLib to process untrusted documents or queries you should
36+
take measures to restrict file and network access.
37+
38+
For information on available security measures, see the RDFLib
39+
:doc:`Security Considerations </security_considerations>`
40+
documentation.
2941

3042
Getting started
3143
---------------
@@ -56,6 +68,7 @@ If you are familiar with RDF and are looking for details on how RDFLib handles i
5668
merging
5769
upgrade5to6
5870
upgrade4to5
71+
security_considerations
5972

6073

6174
Reference

docs/security_considerations.rst

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
1+
.. _security_considerations: Security Considerations
2+
3+
=======================
4+
Security Considerations
5+
=======================
6+
7+
RDFLib is designed to access arbitrary network and file resources. In some cases
8+
these are directly requested resources; in other cases they are indirectly
9+
referenced resources.
10+
11+
An example of where indirect resources are accessed is JSON-LD processing, where
12+
network or file resources referenced by ``@context`` values will be loaded and
13+
processed.
14+
15+
RDFLib also supports SPARQL, which has federated query capabilities that allow
16+
queries to query arbitrary remote endpoints.
17+
18+
If you are using RDFLib to process untrusted documents or queries you should
19+
take measures to restrict file and network access.
20+
21+
Some measures that can be taken to restrict file and network access are:
22+
23+
* `Operating System Security Measures`_.
24+
* `Python Runtime Audit Hooks`_.
25+
* `Custom URL Openers`_.
26+
27+
Of these, operating system security measures are recommended. The other
28+
measures work, but they are not as effective as operating system security
29+
measures, and even if they are used they should be used in conjunction with
30+
operating system security measures.
31+
32+
Operating System Security Measures
33+
==================================
34+
35+
Most operating systems provide functionality that can be used to restrict
36+
network and file access of a process.
37+
38+
Some examples of these include:
39+
40+
* `Open Container Initiative (OCI) Containers
41+
<https://www.opencontainers.org/>`_ (aka Docker containers).
42+
43+
Most OCI runtimes provide mechanisms to restrict network and file access of
44+
containers. For example, using Docker, you can limit your container to only
45+
accessing files explicitly mapped into the container and only accessing the
46+
network through a firewall. For more information refer to the
47+
documentation of the tool you use to manage your OCI containers:
48+
49+
* `Kubernetes <https://kubernetes.io/docs/home/>`_
50+
* `Docker <https://docs.docker.com/>`_
51+
* `Podman <https://podman.io/>`_
52+
53+
* `firejail <https://firejail.wordpress.com/>`_ can be used to
54+
sandbox a process on Linux and restrict its network and file access.
55+
56+
* File and network access restrictions.
57+
58+
Most operating systems provide a way to restrict operating system users to
59+
only being able to access files and network resources that are explicitly
60+
allowed. Applications that process untrusted input could be run as a user with
61+
these restrictions in place.
62+
63+
Many other measures are available; however, listing them is outside the scope
64+
of this document.
65+
66+
Of the listed measures OCI containers are recommended. In most cases, OCI
67+
containers are constrained by default and can't access the loopback interface
68+
and can only access files that are explicitly mapped into the container.
69+
70+
Python Runtime Audit Hooks
71+
==========================
72+
73+
From Python 3.8 onwards, Python provides a mechanism to install runtime audit
74+
hooks that can be used to limit access to files and network resources.
75+
76+
The runtime audit hook system is described in more detail in `PEP 578 – Python
77+
Runtime Audit Hooks <https://peps.python.org/pep-0578/>`_.
78+
79+
Runtime audit hooks can be installed using the `sys.addaudithook
80+
<https://docs.python.org/3/library/sys.html#sys.addaudithook>`_ function, and
81+
will then get called when audit events occur. The audit events raised by the
82+
Python runtime and standard library are described in Python's `audit events
83+
table <https://docs.python.org/3/library/audit_events.html>`_.
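As a minimal sketch of how a hook sees these events, the following installs a passive hook that only records ``open`` and ``urllib.Request`` events while a flag is set. The names ``observed``, ``state`` and ``recording_hook`` are hypothetical and used only for illustration; they are not part of RDFLib or the standard library.

```python
import os
import sys
import tempfile
from typing import Any, Dict, List, Tuple

# Hypothetical names for illustration; none of these come from RDFLib.
observed: List[Tuple[str, Any]] = []
state: Dict[str, bool] = {"recording": False}


def recording_hook(name: str, args: Tuple[Any, ...]) -> None:
    """Record file and URL audit events while recording is enabled."""
    if state["recording"] and name in ("open", "urllib.Request"):
        # For both events the first argument is the path or URL.
        observed.append((name, args[0]))


# Audit hooks cannot be removed once added, so this one stays passive
# unless recording is switched on.
sys.addaudithook(recording_hook)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.ttl")
    state["recording"] = True
    try:
        # The built-in open() raises an "open" audit event for this path.
        with open(path, "w") as handle:
            handle.write("")
    finally:
        state["recording"] = False
```

A recording hook like this can be a useful first step for finding out which files and URLs a given workload touches before deciding what to block.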
84+
85+
RDFLib uses `urllib.request.urlopen` for HTTP, HTTPS and other network access,
86+
and this function raises a ``urllib.Request`` audit event. For file access,
87+
RDFLib uses `open`, which raises an ``open`` audit event.
88+
89+
Users of RDFLib can install audit hooks that react to these audit events and
90+
raise an exception when an attempt is made to access files or network resources
91+
that are not explicitly allowed.
92+
93+
RDFLib's test suite includes tests which verify that audit hooks can block
94+
access to network and file resources.
95+
96+
RDFLib also includes an example that shows how runtime audit hooks can be
97+
used to restrict network and file access in :mod:`~examples.secure_with_audit`.
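The bundled example blocks a specific suffix. An allowlist is a common alternative, sketched below under stated assumptions: ``make_allowlist_hook``, ``allowed_hosts`` and ``allowed_dirs`` are hypothetical names, and the containment check is deliberately simple. It is not RDFLib's implementation.

```python
import os
from typing import Any, Callable, Iterable, Tuple
from urllib.parse import urlsplit

AuditHook = Callable[[str, Tuple[Any, ...]], None]


def make_allowlist_hook(
    allowed_hosts: Iterable[str], allowed_dirs: Iterable[str]
) -> AuditHook:
    """Build an audit hook that only permits listed hosts and directories.

    Hypothetical helper for illustration; not part of RDFLib.
    """
    hosts = {host.lower() for host in allowed_hosts}
    roots = [os.path.realpath(directory) for directory in allowed_dirs]

    def hook(name: str, args: Tuple[Any, ...]) -> None:
        if name == "urllib.Request":
            url = str(args[0])
            host = urlsplit(url).hostname
            if host is None or host not in hosts:
                raise PermissionError(f"Permission denied for URL: {url}")
        elif name == "open":
            target = args[0]
            if isinstance(target, int):
                # An integer is an already-open file descriptor, not a path.
                return None
            real = os.path.realpath(os.fsdecode(target))
            # os.path.join(root, "") appends a trailing separator to root.
            if not any(
                real == root or real.startswith(os.path.join(root, ""))
                for root in roots
            ):
                raise PermissionError(f"Permission denied for file: {real}")
        return None

    return hook


# Installing the hook is a one-way operation for the whole process:
# sys.addaudithook(make_allowlist_hook({"example.org"}, ["/srv/data"]))
```

An allowlist on ``open`` may also affect unrelated file access in the same process, such as module imports, so the allowed directories would likely need to include the Python installation itself.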
98+
99+
Custom URL Openers
100+
==================
101+
102+
RDFLib uses `urllib.request.urlopen` for HTTP, HTTPS and other network
103+
access. This function will use a `urllib.request.OpenerDirector` installed with
104+
`urllib.request.install_opener` to open the URLs.
105+
106+
Users of RDFLib can install a custom URL opener that raises an exception when an
107+
attempt is made to access network resources that are not explicitly allowed.
108+
109+
RDFLib's test suite includes tests which verify that custom URL openers can be
110+
used to block access to network resources.
111+
112+
RDFLib also includes an example that shows how a custom opener can be used to
113+
restrict network access in :mod:`~examples.secure_with_urlopen`.
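As a variation on the bundled example, the sketch below checks the host in a request pre-processing handler, so the check should also apply to requests the opener creates when following redirects. ``HostAllowlistHandler`` and ``build_allowlist_opener`` are hypothetical names for illustration. Because ``build_opener`` keeps the default handlers, other schemes such as ``file:`` and ``ftp:`` are not restricted by this sketch.

```python
from typing import Iterable
from urllib.parse import urlsplit
from urllib.request import BaseHandler, OpenerDirector, Request, build_opener


class HostAllowlistHandler(BaseHandler):
    """Reject HTTP(S) requests to hosts that are not explicitly allowed.

    Hypothetical handler for illustration; not part of RDFLib.
    """

    def __init__(self, allowed_hosts: Iterable[str]) -> None:
        self.allowed_hosts = {host.lower() for host in allowed_hosts}

    def http_request(self, req: Request) -> Request:
        # Pre-processors run before the opener makes any connection.
        host = urlsplit(req.full_url).hostname
        if host is None or host not in self.allowed_hosts:
            raise PermissionError(f"Permission denied for URL: {req.full_url}")
        return req

    # Apply the same check to HTTPS requests.
    https_request = http_request


def build_allowlist_opener(allowed_hosts: Iterable[str]) -> OpenerDirector:
    """Build an opener with the default handlers plus the allowlist check."""
    return build_opener(HostAllowlistHandler(allowed_hosts))


# To make this the process-wide default used by urlopen:
# urllib.request.install_opener(build_allowlist_opener({"example.org"}))
```

Raising from a pre-processor means a disallowed request fails before any network traffic is attempted, which also makes the behaviour easy to test offline.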

examples/secure_with_audit.py

Lines changed: 120 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,120 @@
1+
"""
2+
This example demonstrates how to use `Python audit hooks
3+
<https://docs.python.org/3/library/sys.html#sys.addaudithook>`_ to block access
4+
to files and URLs.
5+
6+
It installs an audit hook with `sys.addaudithook <https://docs.python.org/3/library/sys.html#sys.addaudithook>`_ that blocks access to files and
7+
URLs that end with ``blocked.jsonld``.
8+
9+
The code in the example then verifies that the audit hook is blocking access to
10+
URLs and files as expected.
11+
"""
12+
13+
import logging
14+
import os
15+
import sys
16+
from typing import Any, Optional, Tuple
17+
18+
from rdflib import Graph
19+
20+
21+
def audit_hook(name: str, args: Tuple[Any, ...]) -> None:
22+
"""
23+
An audit hook that blocks access when an attempt is made to open a
24+
file or URL that ends with ``blocked.jsonld``.
25+
26+
Details of the audit events can be seen in the `audit events
27+
table <https://docs.python.org/3/library/audit_events.html>`_.
28+
29+
:param name: The name of the audit event.
30+
:param args: The arguments of the audit event.
31+
:return: `None` if the audit hook does not block access.
32+
:raises PermissionError: If the file or URL being accessed ends with ``blocked.jsonld``.
33+
"""
34+
if name == "urllib.Request" and args[0].endswith("blocked.jsonld"):
35+
raise PermissionError("Permission denied for URL")
36+
if name == "open" and isinstance(args[0], str) and args[0].endswith("blocked.jsonld"):
37+
raise PermissionError("Permission denied for file")
38+
return None
39+
40+
41+
def main() -> None:
42+
"""
43+
The main code of the example.
44+
45+
The important steps are:
46+
47+
* Install a custom audit hook that blocks some URLs and files.
48+
* Attempt to parse a JSON-LD document that will result in a blocked URL being accessed.
49+
* Verify that the audit hook blocked access to the URL.
50+
* Attempt to parse a JSON-LD document that will result in a blocked file being accessed.
51+
* Verify that the audit hook blocked access to the file.
52+
"""
53+
54+
logging.basicConfig(
55+
level=os.environ.get("PYTHON_LOGGING_LEVEL", logging.INFO),
56+
stream=sys.stderr,
57+
datefmt="%Y-%m-%dT%H:%M:%S",
58+
format=(
59+
"%(asctime)s.%(msecs)03d %(process)d %(thread)d %(levelno)03d:%(levelname)-8s "
60+
"%(name)-12s %(module)s:%(lineno)s:%(funcName)s %(message)s"
61+
),
62+
)
63+
64+
if sys.version_info < (3, 8):
65+
logging.warning("This example requires Python 3.8 or higher")
66+
return None
67+
68+
# Install the audit hook
69+
#
70+
# note on type error: This is needed because we are running mypy with python
71+
# 3.7 mode, so mypy thinks the previous condition will always be true.
72+
sys.addaudithook(audit_hook) # type: ignore[unreachable]
73+
74+
graph = Graph()
75+
76+
# Attempt to parse a JSON-LD document that will result in the blocked URL
77+
# being accessed.
78+
error: Optional[PermissionError] = None
79+
try:
80+
graph.parse(
81+
data=r"""{
82+
"@context": "http://example.org/blocked.jsonld",
83+
"@id": "example:subject",
84+
"example:predicate": { "@id": "example:object" }
85+
}""",
86+
format="json-ld",
87+
)
88+
except PermissionError as caught:
89+
logging.info("Permission denied: %s", caught)
90+
error = caught
91+
92+
# `Graph.parse` would have resulted in a `PermissionError` being raised from
93+
# the audit hook.
94+
assert isinstance(error, PermissionError)
95+
assert error.args[0] == "Permission denied for URL"
96+
97+
# Attempt to parse a JSON-LD document that will result in the blocked file
98+
# being accessed.
99+
error = None
100+
try:
101+
graph.parse(
102+
data=r"""{
103+
"@context": "file:///srv/blocked.jsonld",
104+
"@id": "example:subject",
105+
"example:predicate": { "@id": "example:object" }
106+
}""",
107+
format="json-ld",
108+
)
109+
except PermissionError as caught:
110+
logging.info("Permission denied: %s", caught)
111+
error = caught
112+
113+
# `Graph.parse` would have resulted in a `PermissionError` being raised from
114+
# the audit hook.
115+
assert isinstance(error, PermissionError)
116+
assert error.args[0] == "Permission denied for file"
117+
118+
119+
if __name__ == "__main__":
120+
main()

examples/secure_with_urlopen.py

Lines changed: 82 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,82 @@
1+
"""
2+
This example demonstrates how to use a custom global URL opener installed with `urllib.request.install_opener` to block access to URLs.
3+
"""
4+
import http.client
5+
import logging
6+
import os
7+
import sys
8+
from typing import Optional
9+
from urllib.request import HTTPHandler, OpenerDirector, Request, install_opener
10+
11+
from rdflib import Graph
12+
13+
14+
class SecuredHTTPHandler(HTTPHandler):
15+
"""
16+
An HTTP handler that blocks access to URLs that end with "blocked.jsonld".
17+
"""
18+
19+
def http_open(self, req: Request) -> http.client.HTTPResponse:
20+
"""
21+
Block access to URLs that end with "blocked.jsonld".
22+
23+
:param req: The request to open.
24+
:return: The response.
25+
:raises PermissionError: If the URL ends with "blocked.jsonld".
26+
"""
27+
if req.get_full_url().endswith("blocked.jsonld"):
28+
raise PermissionError("Permission denied for URL")
29+
return super().http_open(req)
30+
31+
32+
def main() -> None:
33+
"""
34+
The main code of the example.
35+
36+
The important steps are:
37+
38+
* Install a custom global URL opener that blocks some URLs.
39+
* Attempt to parse a JSON-LD document that will result in a blocked URL being accessed.
40+
* Verify that the URL opener blocked access to the URL.
41+
"""
42+
43+
logging.basicConfig(
44+
level=os.environ.get("PYTHON_LOGGING_LEVEL", logging.INFO),
45+
stream=sys.stderr,
46+
datefmt="%Y-%m-%dT%H:%M:%S",
47+
format=(
48+
"%(asctime)s.%(msecs)03d %(process)d %(thread)d %(levelno)03d:%(levelname)-8s "
49+
"%(name)-12s %(module)s:%(lineno)s:%(funcName)s %(message)s"
50+
),
51+
)
52+
53+
opener = OpenerDirector()
54+
opener.add_handler(SecuredHTTPHandler())
55+
install_opener(opener)
56+
57+
graph = Graph()
58+
59+
# Attempt to parse a JSON-LD document that will result in the blocked URL
60+
# being accessed.
61+
error: Optional[PermissionError] = None
62+
try:
63+
graph.parse(
64+
data=r"""{
65+
"@context": "http://example.org/blocked.jsonld",
66+
"@id": "example:subject",
67+
"example:predicate": { "@id": "example:object" }
68+
}""",
69+
format="json-ld",
70+
)
71+
except PermissionError as caught:
72+
logging.info("Permission denied: %s", caught)
73+
error = caught
74+
75+
# `Graph.parse` would have resulted in a `PermissionError` being raised from
76+
# the url opener.
77+
assert isinstance(error, PermissionError)
78+
assert error.args[0] == "Permission denied for URL"
79+
80+
81+
if __name__ == "__main__":
82+
main()
