Commit 44ee362

Add support for newer and older
1 parent 58e9111 commit 44ee362

6 files changed: 72 additions and 4 deletions.

docs/manual/access-control.rst

Lines changed: 22 additions & 4 deletions
@@ -95,7 +95,7 @@ An .aclj file may look as follows::

 Each JSON entry contains an ``access`` field and the original ``url`` field that was used to convert to the SURT (if any).

-The JSON entry may also contain ``user``, ``before``, and ``after`` fields, as explained below.
+The JSON entry may also contain ``user``, ``before``, ``after``, ``newer``, and ``older`` fields, as explained in the sections below.

 The prefix consists of a SURT key and a ``-`` (currently reserved for a timestamp/date range field to be added later).

@@ -166,10 +166,10 @@ Further examples of how to set this header will be provided in the deployments s
 See the :ref:`config-acl-header` section in Usage for examples on how to configure this header.


-Date-Based Access Controls
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+Date-Based Access Controls: Before/After Exact Date
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-The access control rules can further be customized be specifying different permissions based on capture timestamp, using ``before`` and ``after`` fields that operate in the same manner as their embargo counterparts for a specific URL or domain.
+It is also possible to control access based on capture timestamp, using ``before`` and ``after`` fields to specify an exact timestamp.

 For example, the following access control settings restrict access to ``https://example.com/restricted/`` by default, but allow access for captures prior to December 1, 2010::

@@ -183,6 +183,24 @@ Combined with the embargo settings, this can also be used to override the embarg
   com,example)/restricted - {"access": "allow"}


+Date-Based Access Controls: Time Interval
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Access can also be controlled by specifying a relative time interval, similar to embargoes.
+
+For example, the following access control settings restrict access to ``https://example.com/restricted/`` by default, but allow access to all captures newer than 1 year::
+
+  com,example)/restricted - {"access": "allow", "newer": {"years": 1}}
+  com,example)/restricted - {"access": "block"}
+
+The following access control settings restrict access to ``https://example.com/restricted/`` by default, but allow access to all captures older than 1 year, 2 months, 3 weeks, and 4 days::
+
+  com,example)/restricted - {"access": "allow", "older": {"years": 1, "months": 2, "weeks": 3, "days": 4}}
+  com,example)/restricted - {"access": "block"}
+
+Any combination of years, months, weeks and days can be used (as long as at least one is provided) for the ``newer`` or ``older`` access control settings.
+
+
 Access Error Messages
 ^^^^^^^^^^^^^^^^^^^^^

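
The interval shape described in this documentation change can be checked with a short sketch. It assumes an .aclj line has the form `<SURT key> - <JSON rule>`, as in the documented examples. The helper name `parse_acl_line` and its validation rules are hypothetical and are not part of pywb.

```python
import json

INTERVAL_UNITS = {"years", "months", "weeks", "days"}


def parse_acl_line(line):
    """Split an .aclj line into (surt_key, rule) and sanity-check any interval.

    Hypothetical helper for illustration only; pywb does not expose it.
    """
    # Assumed layout: "<surt key> - <json rule>", as in the documented examples.
    key, _, payload = line.partition(" - ")
    rule = json.loads(payload)

    # Units placed next to "access" instead of inside the interval object
    # are still valid JSON, so flag them explicitly.
    stray = INTERVAL_UNITS & set(rule)
    if stray:
        raise ValueError(f"interval units outside 'newer'/'older': {sorted(stray)}")

    for field in ("newer", "older"):
        interval = rule.get(field)
        if interval is None:
            continue
        if not isinstance(interval, dict) or not interval:
            raise ValueError(f"{field!r} must be a non-empty object")
        unknown = set(interval) - INTERVAL_UNITS
        if unknown:
            raise ValueError(f"unsupported units in {field!r}: {sorted(unknown)}")

    return key, rule


key, rule = parse_acl_line(
    'com,example)/restricted - {"access": "allow", "older": {"years": 1, "months": 2}}'
)
print(key, rule["older"])
```

A check of this kind matters mostly for misplaced braces: a rule whose `months` or `days` key sits outside the interval object still parses as JSON, and would otherwise be read as a shorter interval than intended.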

pywb/warcserver/access_checker.py

Lines changed: 22 additions & 0 deletions
@@ -196,6 +196,28 @@ def check_date_access(
             after = timestamp_to_datetime(after_ts, tz_aware=True)
             return access if dt > after else default_access

+        newer = rule.get('newer')
+        if newer:
+            delta = relativedelta(
+                years=newer.get('years', 0),
+                months=newer.get('months', 0),
+                weeks=newer.get('weeks', 0),
+                days=newer.get('days', 0)
+            )
+            actual = datetime.now(timezone.utc) - delta
+            return access if actual < dt else default_access
+
+        older = rule.get('older')
+        if older:
+            delta = relativedelta(
+                years=older.get('years', 0),
+                months=older.get('months', 0),
+                weeks=older.get('weeks', 0),
+                days=older.get('days', 0)
+            )
+            actual = datetime.now(timezone.utc) - delta
+            return access if actual > dt else default_access
+
         return access

     def create_access_aggregator(self, source_files):
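
The two new branches differ only in the direction of the comparison, which the following sketch tries to make explicit. It is a stdlib-only illustration rather than pywb's code: `subtract_interval` is a hypothetical stand-in for subtracting a `relativedelta`, and `now` is passed in as a parameter (an assumption; the committed code calls `datetime.now(timezone.utc)` directly) so that the result is reproducible.

```python
import calendar
from datetime import datetime, timedelta, timezone


def subtract_interval(now, interval):
    """Return `now` minus years/months/weeks/days (stand-in for relativedelta)."""
    # Shift by whole months first, clamping the day to the target month's length.
    total = now.year * 12 + (now.month - 1)
    total -= interval.get("years", 0) * 12 + interval.get("months", 0)
    year, month0 = divmod(total, 12)
    day = min(now.day, calendar.monthrange(year, month0 + 1)[1])
    shifted = now.replace(year=year, month=month0 + 1, day=day)
    # Weeks and days are plain fixed-length offsets.
    return shifted - timedelta(weeks=interval.get("weeks", 0),
                               days=interval.get("days", 0))


def interval_access(rule, capture_dt, now, default_access="block"):
    """Mirror the comparisons in the diff for `newer` and `older` rules."""
    access = rule["access"]
    newer = rule.get("newer")
    if newer:
        cutoff = subtract_interval(now, newer)
        # Capture must fall after the cutoff to count as "newer".
        return access if cutoff < capture_dt else default_access
    older = rule.get("older")
    if older:
        cutoff = subtract_interval(now, older)
        # Capture must fall before the cutoff to count as "older".
        return access if cutoff > capture_dt else default_access
    return access


now = datetime(2024, 3, 31, tzinfo=timezone.utc)
capture = datetime(2014, 1, 27, 17, 12, 38, tzinfo=timezone.utc)
print(interval_access({"access": "allow", "older": {"years": 1}}, capture, now))
print(interval_access({"access": "allow", "newer": {"years": 1, "months": 6}}, capture, now))
```

As in the diff, a rule whose interval condition is not met falls back to `default_access` instead of the rule's own `access` value.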

sample_archive/access/newer.aclj

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+org,iana)/ - {"access": "allow", "url": "http://www.iana.org/", "newer": {"years": 1, "months": 6}}

sample_archive/access/older.aclj

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+org,iana)/ - {"access": "allow", "url": "http://www.iana.org/", "older": {"years": 1}}

tests/config_test_access.yaml

Lines changed: 14 additions & 0 deletions
@@ -76,6 +76,20 @@ collections:
         acl_paths:
             - ./sample_archive/access/after.aclj

+    pywb-acl-newer:
+        index_paths: ./sample_archive/cdx/
+        archive_paths: ./sample_archive/warcs/
+        default_access: block
+        acl_paths:
+            - ./sample_archive/access/newer.aclj
+
+    pywb-acl-older:
+        index_paths: ./sample_archive/cdx/
+        archive_paths: ./sample_archive/warcs/
+        default_access: block
+        acl_paths:
+            - ./sample_archive/access/older.aclj
+
     pywb-wildcard-surt:
         index_paths: ./sample_archive/cdx/
         archive_paths: ./sample_archive/warcs/

tests/test_acl.py

Lines changed: 12 additions & 0 deletions
@@ -114,3 +114,15 @@ def test_acl_after(self):
         assert 'Access Blocked' in resp.text

         resp = self.testapp.get('/pywb-acl-after/20140127171238mp_/http://www.iana.org/', status=200)
+
+    def test_acl_newer(self):
+        resp = self.testapp.get('/pywb-acl-newer/20140127171238mp_/http://www.iana.org/', status=451)
+        assert 'Access Blocked' in resp.text
+
+        resp = self.testapp.get('/pywb-acl-newer/20140126200624mp_/http://www.iana.org/', status=451)
+        assert 'Access Blocked' in resp.text
+
+    def test_acl_older(self):
+        resp = self.testapp.get('/pywb-acl-older/20140127171238mp_/http://www.iana.org/', status=200)
+
+        resp = self.testapp.get('/pywb-acl-older/20140126200624mp_/http://www.iana.org/', status=200)
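
Both tests request captures from January 2014, so their outcome should not depend on the day the suite runs: for any run date after mid-2015 those captures are more than 1 year and 6 months old, which means blocked under `newer.aclj` and allowed under `older.aclj`. A rough sketch of that reasoning follows. It assumes the 14-digit timestamps are `YYYYMMDDhhmmss` in UTC and approximates years and months with fixed day counts; the helper name `capture_age` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def capture_age(timestamp, now=None):
    """Return how long ago a 14-digit capture timestamp was, as a timedelta."""
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - captured


for ts in ("20140127171238", "20140126200624"):
    age = capture_age(ts)
    # Approximation: 365-day years and 30-day months, which is enough
    # when the margin is several years.
    print(ts,
          "older than 1 year:", age > timedelta(days=365),
          "| newer than 1.5 years:", age < timedelta(days=365 + 6 * 30))
```

Exercising the allowed branch of a `newer` rule would likely need a capture dated relative to the test run, or a mocked clock, since the sample captures are fixed in 2014.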
