List / Sub-Table Friendly Operations #22
Replies: 9 comments
-
Hey again Victor, What do you mean by list is this case? I am also having trouble understanding what you mean here:
|
Beta Was this translation helpful? Give feedback.
-
basically what I was getting it is some "list friendly" operations. so if you want to build a list datatype overtop the key/value interface, you're going to have to spread out the values in that list over many key/value pairs as it grows beyond the HSE_KVS_VLEN_MAX limit of one. I write a list blueprint at the key for the list, "listkeyname", and then grow list entries into segments like "listkeyname.0", "listkeyname.1", "listkeyname.2" etc and just some general observations I had while implementing this. one super convenient operation would be renaming keys in a KVS (with overwriting allowed). take the situation where I've now deleted all the values in "listkeyname.1", and want to move "listkeyname.2" into "listkeyname.1". at present I have to pull the value of "listkeyname.2" out, delete it, and then write it into "listkeyname.1". but theoretically I could just do an overwriting rename operation. another as I mentioned in the first posting, would be the ability to delete by prefix in a KVS of
I do have some important lists and subtables (again, logical objects i've built over the key/value interface) in hot paths, that I've put into their own KVSs, but I have tons more lists that reside in a general purpose KVS of if any others come to mind i'll share them. |
Beta Was this translation helpful? Give feedback.
-
If I am understanding you correctly @victorstewart, My colleagues might have more to add on this subject, but at present and probably for always a KV tuple is immutable. The key cannot change and the value cannot change for that specific tuple. You can call You seem really familiar with KV stores though. Do you have any APIs for other KV stores that implement this kind of idea or other list-like APIs in general? As for your last idea of Have you read the KVS/prefix section in our best practices section of the wiki? https://github.com/hse-project/hse/wiki/Best-Practices#keys-and-kvs Are you hitting the limit of maximum amount of KVSs allowed in a KVDB by chance? |
Beta Was this translation helpful? Give feedback.
-
@victorstewart - I think it would be helpful to rewind to the problem you are actually looking to solve. The combination of that problem and the form of the HSE API may well have sent you down one path while another might better address your need. If that's not the case, then we can help with either using what facilities are there or take into consideration extensions. What can you say about the following:
Answers to these things, and related items that the above questions bring to mind for you, will help. |
Beta Was this translation helpful? Give feedback.
-
right I know that on the prefixes, I assume that when no prefix is set, that the hashed key values are uniformly distributed about the keyspace, but a prefix would provide no better performance than individually getting then putting each value at a new key. again just thought it worth mentioning. so the reason I built the list like this is because practically all items will be below the and for the time being I artificially limited all values stored in my database to at the moment I'm just using one KVDB, but of course could grow to 8, and 16 KVSs per. and i'm not hitting any limits yet. as i stated i have some very important hotpath lists and subtables that i've created custom KVSs for, to exploit the prefix performance benefits on access. but in the general case i can't be creating custom KVSs per list or subtable. that said the performance with the current API is fantastic regardless! just thought it worth mentioning the subject of creating logic data structures on top of key/value pairs. |
Beta Was this translation helpful? Give feedback.
-
important to note firstly that, in probably MOST cases, every hot list not important enough for it's own KVS will be able to fit all its items into a single and all these points almost equally apply to the concept of subtables, where you must spread each sub entry onto its own key, thus the logical object grows over many more keys than than a list. but in those cases really the only "missing" abilities would be delete all or get all, (aka prefixed based operations when a KVS has not defined prefix). (and of course again I have some very important ones that get their own KVSs). (and obviously a "get all" isn't as practical as a delete all given the needed receive buffer size is unknown). both delete all and get all currently require a slow cursor iteration. but again, this is fine because if any where in a hot enough path you just upgrade them to their own KVS. so i only provide the below responses because maybe they trigger some productive thoughts, but as far as I can see, all are satisfied by the current interface.
the larger they are the colder they are. so i have some lists that will grow by maybe +100K items every 3 years. fitting about 10K values per other lists and subtables that are probed often to semi-often (but not frequently) are no more than dozens to 100s of items.
either in the range of 100B-200B, or 2KB-5KB.
a dozen or so application wide lists. ballpark of 15 (lists + subtables) per user. plus another batch with variable scaling (will begin around 300 per user approaching 1 per user over infinite time).
all either permanent (95%+) or semi-permanent.
insert to head or tail. random deletes by value. delete from head or tail. delete all.
get all. get all and delete all. getting n items from the head or tail. checking for the existence of an item in a list, so iterative from head. |
Beta Was this translation helpful? Give feedback.
-
@victorstewart if I understand correctly, you’re mapping Redis semantics to HSE and this question is specifically around implementing Redis lists, correct? I am far from being a Redis expert, but after reading the Redis List description let me propose an example mapping to HSE, and then outline a potential optimization. In doing so, I’m using the notation established in the “Concepts” section of the HSE wiki. We define three KVS, two of which (MetaKvs and KeyKvs) you’ll all use to support all Redis data types in your system, and one (ListKvs) for storing all Redis list data. MetaKvs stores various metadata values for implementing your system. MetaKvs stores KV pairs with unsegmented keys (pfx_len=0) where the key is the name of the piece of metadata, and the value is the current value for that piece of metadata. In the examples below, we’ll store the key “keyID-next”, whose value is a uint64 initially set (put) to 0 when the system is first initialized. KeyKvs maps Redis keys to a unique identifier. We do this both to support rename and to save storage space, as will become clear. KeyKvs stores KV pairs with unsegmented keys (pfx_len=0) where the key is just the Redis key, and the value is a tuple (keyID, keyType), where keyID is the unique identifier for that Redis key, and keyType is its Redis data type (e.g., “L” for list). To generate keyID, we get, increment, and put (in a transaction) the value for key “keyID-next” in MetaKvs. ListKvs contains the data for all Redis lists. ListKvs stores KV pairs with multi-segment keys of the form (listID, typeID, elemID) where
Values for KV pairs in ListKvs are defined as follows
Furthermore
EXAMPLE: to create a new Redis list whose Redis key is “mylist” and whose only element is “a” (e.g., LPUSH mylist a), do all the following in a transaction.
Above we are initializing the new listID in ListKvs. I am starting elemID values at 100 to leave (lots) of room for more distinguished keys in ListKvs if needed in the future.
Above we are storing the Redis list element. Since this is the only element in the list, we set its (prevID, nextID) to (0, 0) indicating no previous or next element in the list. EXAMPLE: to add element “b” to the tail of existing Redis list “mylist” (e.g., RPUSH mylist b), do all the following in a transaction.
Above we get an identifier for the new element and increment it for future use.
Above we add the new element to ListKvs with its prevID pointing to the former tail element (tailID)
Above we make the former tail element point to the new tail (newID)
Above we update the distinguished key pointing to the tail of list listID. EXAMPLE: to rename Redis list “mylist” to “todolist” (e.g., RENAME mylist todolist), do all the following in a transaction.
In an actual implementation of rename you need to check for “todolist” in KeyKvs and if it exists then delete all related data as determined by its Redis data type. You want to structure your KVS key prefixes so you can do this with a single prefix delete. For example, if “todolist” were an existing Redis list, we could delete all associated data with a single prefix-delete(todolistID) in ListKvs. The above examples use a doubly linked list of elements, which the Redis docs imply is how they implement it. There are numerous other ways to structure this data to potentially achieve better performance, such as grouping (prevID, nextID) within a single value. The main point here is to demonstrate how to use KVS and structure keys in HSE. |
Beta Was this translation helpful? Give feedback.
-
@smoyergh wow thank you that was exceedingly thorough, let me digest it and see if i have any points to add. but you're right adding those layers of KVS indirection is the only way to ensure the type safety of logical types built over key-value space. also collapsing to a constant length key blueprint as you suggest does solve all the prefix based performance i was missing... fantastic! and again that's clearly the only way to do so generically without relying on KVS spread + integrating a brittle KVS design based on "hard coded" data assumptions. i once heard someone said every problem in computer science can be solved by another layer of indirection, and now i'm always reminded of that. currently, I didn't build in any type safety, just assumed that the application will always do the right thing, plus instituted naming conventions for keys that will assure no interference between anything, but as my team grows i'll definitely circle back and reimplemented the database as you've laid out for data integrity reasons. |
Beta Was this translation helpful? Give feedback.
-
@smoyergh at the moment i implemented subtables as... "HSET key subkey value" just maps to "key.subkey" -> value and lists where... a blueprint exists at the list key, what holds a list count plus head and tail suffixes. those suffixes map to segments. say the head at key "listkey.0" and the tail at "listkey.2". and i fill entries into those segments until the entire 1MB is consumed, and then the overflow into a new segment (like those water doors the have in the hull of a ship to arrest contagion). i mapped out all the usage in mind, and the minimal indirection and lack of transactions or cursor construction in my design actually optimizes for the fast path, at the expense of the type safety and vast elegance of your design. as far as lists go, it optimizes for the fast path since almost every time the operation can be served by the first segment (head or tail) gathered. and when it can't, there's a 1 time cost of overflow. or at worst served by seeking to the next segment (in the case of deleting the first instance of a value found). this also makes fully consuming a non head or tail segment, thus requiring "renaming" aka compaction, exceedingly rare. for subtables, the only time i do a "get all values" (besides for 2 hyper specialized collections of data which thus require custom commands to transfer intent efficiently, thus personal KVSs are no added complexity), is one super cold path.. thus a non frequent performance cost. that said, i'm sure we will eventually reimplement to your specification one people must less cavalier than me urge safety over fast path optimized design. thank you so much again. V |
Beta Was this translation helpful? Give feedback.
-
so
hse_kvs_prefix_delete
ispfx_len
restricted, and that's great when a given subset of data is important enough to warrant it's own KVS. but in the general case of many lists all of varying prefix length, spanning multiple keys, as pressured by theHSE_KVS_VLEN_MAX
value, it would be great to have a way to just prefix delete an entire list.or maybe that's not technically practical?
i'd make the same point about a "variable-prefix read" (or cursor), but also assume there's a reason these abilities don't exist.
regardless thought it warranted a discussion?
Beta Was this translation helpful? Give feedback.
All reactions