Data Cache

A simple in-memory data cache designed for local, non-distributed ML applications. Built using Redis and Apache Arrow's Plasma in-memory store.

Installation

Install using pip: pip install git+https://github.com/jchacks/data_cache.git

Prerequisites

Two Python packages are required:

  • Pyarrow
  • Redis

A running Redis server is also required; it serves as the message queue.
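
Before starting the server or a client, it can help to confirm that something is listening where Redis is expected. The sketch below is not part of data_cache: `redis_port_open` is a hypothetical helper, and it assumes Redis runs on `localhost` at its default port 6379. It only checks that the TCP port accepts connections, not that the listener is actually Redis.

```python
import socket


def redis_port_open(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; 6379 is the default Redis port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


print("Redis port open:", redis_port_open())
```

If this prints `False`, start the Redis server (or adjust the host and port) before running the examples that follow.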

Usage

Server

from data_cache import PlasmaServer

s = PlasmaServer(100000000) # 100MB
s.start()
s.wait()

# The location of the plasma store will be printed
# e.g. '/tmp/plasma-qd3yeugu/plasma.sock'
# This location is also added to the Redis store 
# so clients can automatically find it.
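
To get a feel for the store size, the arithmetic below estimates how many of the dummy arrays used in this README (100000 `float32` values each) could fit in a 100 MB store. It assumes the `PlasmaServer` argument is a size in bytes, as the `# 100MB` comment suggests, and it ignores any per-object overhead, so the result is only an upper bound.

```python
import struct

STORE_BYTES = 100_000_000              # value passed to PlasmaServer above (assumed to be bytes)
ELEMENTS_PER_ARRAY = 100_000           # length of each dummy array in the examples
FLOAT32_BYTES = struct.calcsize("<f")  # standard size of a 32-bit float: 4 bytes

array_bytes = ELEMENTS_PER_ARRAY * FLOAT32_BYTES
max_arrays = STORE_BYTES // array_bytes

print(f"one array: {array_bytes} bytes")            # one array: 400000 bytes
print(f"upper bound on arrays that fit: {max_arrays}")  # upper bound on arrays that fit: 250
```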

Data Producing Client

import numpy as np
from data_cache import Client

# Use the same `namespace` everywhere the data needs to be accessed
c = Client()
q = c.make_queue('plasma', None)

# Put some dummy data into the queue: ten float32 arrays filled with 0, 1, ..., 9
for i in range(10):
    r = q.put(np.ones((100000,)).astype('float32') * i)

Data Consuming Client

import numpy as np
from data_cache import Client

c = Client()
q = c.make_queue('plasma', None)  # Use the same name as above

# Fetch data off the queue using q.get()
d = np.stack([q.get() for _ in range(10)])
print(d)

# This prints the stacked array, shape (10, 100000),
# with rows in the order they were queued: 0 through 9

Setting persistent data on the store

import numpy as np 
from data_cache import Client

c = Client()
generic = c.get_or_create_store('generic')
generic['abc'] = np.ones((100000,)).astype('float32')

# Reading the data does not remove it from Plasma
print(generic['abc'])
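
The difference between the queue and the store can be illustrated with standard-library stand-ins. This sketch does not use the data_cache API: `queue.Queue` stands in for a queue from `make_queue` and a plain `dict` for a store from `get_or_create_store`. It assumes the queue hands items back oldest first and that each `get` consumes the item, while a store lookup leaves the data in place.

```python
from queue import Empty, Queue

# Stand-ins for illustration only, NOT the data_cache objects.
q = Queue()
store = {}

for i in range(3):
    q.put(i)
store["abc"] = "payload"

# Each get removes one item from the queue, oldest first.
consumed = [q.get_nowait() for _ in range(3)]

# After three gets the queue is empty.
try:
    q.get_nowait()
    queue_drained = False
except Empty:
    queue_drained = True

# Reading from the store does not remove the entry.
first_read = store["abc"]
second_read = store["abc"]

print(consumed, queue_drained, first_read == second_read)
```

In short, use a queue when each item should be processed once, and a store when several readers need the same data repeatedly.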
