Feat: pems_data caching #194
This looks and works great! 🎉
I was able to follow the logic and it makes sense. At first I overlooked that CachingDataSource.read() (specifically, self.cache.set_df(cache_key, df, ttl=ttl)) will set a cache key (store a result) if it does not find one when reading the first time. Since I kept running the code with the same redis container and volume, I was confused about when the cache had been set, but after realizing this, it all made sense. The cache keys only get set "on demand" as data is read; we don't cache data that a service has not asked for yet (this makes so much sense, it's essentially the whole purpose of a cache, but it took me a bit to realize it 😅).
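The "on demand" behavior described above can be sketched as follows. This is a minimal illustration and not the PR's actual code: the in-memory cache, the CountingSource class, the get_df name, and the "demo:" key prefix are all assumptions made for the example; only CachingDataSource.read() and the set_df(cache_key, df, ttl=ttl) call come from the comment.

```python
from typing import Dict, Optional


class InMemoryCache:
    """Hypothetical stand-in for the redis-backed cache."""

    def __init__(self) -> None:
        self._store: Dict[str, object] = {}

    def get_df(self, key: str) -> Optional[object]:
        # get_df is an assumed counterpart to set_df
        return self._store.get(key)

    def set_df(self, key: str, df: object, ttl: Optional[int] = None) -> None:
        # ttl mirrors the call quoted in the review, but is ignored in this sketch
        self._store[key] = df


class CountingSource:
    """Hypothetical underlying source that records how often it is really read."""

    def __init__(self) -> None:
        self.reads = 0

    def read(self, identifier: str) -> list:
        self.reads += 1
        return [identifier, "row-1", "row-2"]


class CachingDataSource:
    """Check the cache first; read from the source and store the result on a miss."""

    def __init__(self, source: CountingSource, cache: InMemoryCache) -> None:
        self.source = source
        self.cache = cache

    def read(self, identifier: str, ttl: Optional[int] = None):
        cache_key = f"demo:{identifier}"  # assumed key scheme
        df = self.cache.get_df(cache_key)
        if df is None:
            # Cache miss: nothing is stored until some caller asks for the data
            df = self.source.read(identifier)
            self.cache.set_df(cache_key, df, ttl=ttl)
        return df
```

With this shape, reading the same identifier twice touches the underlying source only once, and a key exists in the cache only after the first read.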
I only saw one typo below, everything else looks great!
compose.yml (outdated)
redis:
  image: redis:8
  ports:
    - "${REDIS_LOCAL_PORT:-6379}:6379"
We can probably change REDIS_LOCAL_PORT to REDIS_PORT to match .env.sample and the environment variable loading in cache.py.
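For context, environment variable loading of this kind might look roughly like the sketch below. It is an assumption about the shape of cache.py, not a copy of it: the redis_settings function and the REDIS_HOSTNAME variable are hypothetical, and only the REDIS_PORT name and the 6379 default come from this thread.

```python
import os
from typing import Mapping, Optional, Tuple


def redis_settings(environ: Optional[Mapping[str, str]] = None) -> Tuple[str, int]:
    """Read the redis host and port from the environment, with local defaults."""
    env = os.environ if environ is None else environ
    host = env.get("REDIS_HOSTNAME", "localhost")  # hypothetical variable name
    port = int(env.get("REDIS_PORT", "6379"))  # same name Compose would interpolate
    return host, port
```

Using one variable name in compose.yml, .env.sample, and the Python code means a single override changes the port in all three places.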
Arg, thanks for catching this typo! I must have renamed it at some point and missed this one.
uses an underlying redis connection
$ pems-cache -h
usage: pems-cache [-h] [--key KEY] [--value VALUE] [{check,get,set}]

Simple CLI for the cache

positional arguments:
  {check,get,set}   the operation to perform

options:
  -h, --help        show this help message and exit
  --key/-k KEY      the item's key, required for get/set
  --value/-v VALUE  the item's value, required for set
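An argument parser with this interface could be sketched with argparse as below. This is an illustration rather than the PR's implementation: the function names, the op attribute, and the choice of check as the default operation are assumptions; the argument names and help strings are taken from the usage text above.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pems-cache", description="Simple CLI for the cache")
    # Optional positional; defaulting to "check" is an assumption of this sketch
    parser.add_argument(
        "op",
        nargs="?",
        choices=("check", "get", "set"),
        default="check",
        help="the operation to perform",
    )
    parser.add_argument("--key", "-k", help="the item's key, required for get/set")
    parser.add_argument("--value", "-v", help="the item's value, required for set")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # argparse cannot express "required only for some operations", so validate by hand
    if args.op in ("get", "set") and args.key is None:
        parser.error(f"--key is required for {args.op}")
    if args.op == "set" and args.value is None:
        parser.error("--value is required for set")
    return args
```

parser.error() prints the usage line and exits with status 2, which matches how argparse reports other invalid invocations.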
- accept a host and port parameter
- pass additional kwargs to the redis.Redis() instance
- catch redis.ConnectionError and return None
we don't need to connect to the cache backend immediately; delay that internally via a helper _connect() function
for is_available, get, and set: ensure the backend is connected and available before attempting to use it
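The lazy-connection idea in the commit messages above can be sketched as follows. This is a simplified illustration under stated assumptions: the client is injected through a client_factory argument (in the real class this would presumably be redis.Redis), the FakeClient class is a hypothetical stand-in so the example runs without a redis server, and Python's built-in ConnectionError stands in for redis.ConnectionError.

```python
from typing import Any, Callable, Dict, Optional


class FakeClient:
    """Hypothetical stand-in for redis.Redis, so the sketch needs no server."""

    def __init__(self, host: str = "localhost", port: int = 6379, up: bool = True, **kwargs: Any) -> None:
        self.up = up
        self._data: Dict[str, Any] = {}

    def ping(self) -> bool:
        if not self.up:
            raise ConnectionError("backend unavailable")
        return True

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def set(self, key: str, value: Any, ex: Optional[int] = None) -> None:
        self._data[key] = value


class Cache:
    def __init__(self, client_factory: Callable[..., Any], host: str = "localhost", port: int = 6379, **kwargs: Any) -> None:
        self._client_factory = client_factory
        self.host = host
        self.port = port
        self._kwargs = kwargs  # forwarded to the client constructor
        self.c: Any = None  # no client is created at construction time

    def _connect(self) -> None:
        # Create the client on first use only
        if self.c is None:
            self.c = self._client_factory(host=self.host, port=self.port, **self._kwargs)

    def is_available(self) -> bool:
        try:
            self._connect()
            return bool(self.c.ping())
        except ConnectionError:
            return False

    def get(self, key: str) -> Any:
        # Return None instead of raising when the backend cannot be reached
        if not self.is_available():
            return None
        return self.c.get(key)

    def set(self, key: str, value: Any, ttl: Optional[int] = None) -> None:
        # ttl is the number of seconds until expiration
        if self.is_available():
            self.c.set(key, value, ex=ttl)
```

Constructing a Cache does not touch the backend; the first is_available, get, or set call does. An unreachable backend then looks like a cache miss rather than an error to callers.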
convert DataFrames to and from Arrow IPC bytes
use the DataFrame serialization functions as mutators
so we don't have to implement the get/check/set dance every time
seconds until expiration
for callers passing e.g. key and TTL
forward along to the underlying data source
Looks great!
Closes #198
What this PR does

See #196 for docs that describe pems_data architecture and usage

Introduces a caching layer for pems_data that is mostly transparent for users/callers of pems_data functionality.

The main caching interface is defined in the pems_data.cache.Cache class, which is a wrapper around a redis connection. redis is added as a local Compose service, and the pems-cache CLI is added to help debug and inspect the cache.

pems_data.cache.Cache class

- redis get and set wrappers
- get/set DataFrames from the cache

pems_data.sources.cache.CachingDataSource class

An IDataSource implementation (see #187) that wraps another IDataSource and calls out to the cache first before performing any needed .read() operations on the underlying data source. Data returned from .read() is cached before returning.

pems_data.ServiceFactory class

With the introduction of the caching layer, the services (e.g. StationsService) need additional wiring and setup. This class wraps that setup so callers can get a service that is ready to use. This class should usually serve as the main entrypoint into pems_data for callers.
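The wiring that ServiceFactory hides from callers might look roughly like the sketch below. The class names IDataSource, CachingDataSource, StationsService, and ServiceFactory come from the description; everything else (DictCache, StaticSource, the stations_service() and get_stations() methods, and the sample data) is hypothetical and exists only to make the example runnable.

```python
from typing import Any, Dict, Optional, Protocol


class IDataSource(Protocol):
    def read(self, identifier: str) -> Any: ...


class DictCache:
    """Hypothetical in-memory stand-in for pems_data.cache.Cache."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._items.get(key)

    def set(self, key: str, value: Any, ttl: Optional[int] = None) -> None:
        self._items[key] = value


class StaticSource:
    """Hypothetical underlying data source returning fixed sample data."""

    def read(self, identifier: str) -> Any:
        return ["station-1", "station-2"]


class CachingDataSource:
    """Wraps another IDataSource and consults the cache before reading."""

    def __init__(self, data_source: IDataSource, cache: DictCache) -> None:
        self.data_source = data_source
        self.cache = cache

    def read(self, identifier: str) -> Any:
        value = self.cache.get(identifier)
        if value is None:
            value = self.data_source.read(identifier)
            self.cache.set(identifier, value)
        return value


class StationsService:
    def __init__(self, data_source: IDataSource) -> None:
        self.data_source = data_source

    def get_stations(self) -> Any:  # hypothetical method name
        return self.data_source.read("stations")


class ServiceFactory:
    """Builds the cache and the caching source once, then hands out ready services."""

    def __init__(self) -> None:
        self.cache = DictCache()
        self.data_source = CachingDataSource(StaticSource(), self.cache)

    def stations_service(self) -> StationsService:  # hypothetical method name
        return StationsService(data_source=self.data_source)
```

A caller would then only write something like ServiceFactory().stations_service().get_stations() and never construct the cache or the wrapped data source directly.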