Home
launch one MDS with a machines file; it then starts all DSs for you via ssh
query the state of the system (list of files, list and placement of chunks)
add a local file into the system
put (local) then remote get
retrieve a file from the system: get = extract . fetch. First you fetch the file into the local DS, then you extract it.
create a soft link to a file in the local data store
get with no extract at the end
the CLI asks a remote DS to get
catenate several files into one
stop the system
command line interface (CLI): started on demand with the command to execute, the MDS to connect to (host, port) and the local DS to connect to (host, port)
metadata server (MDS): a daemon, only one
data server (DS): a daemon, one per machine in the machines file
ls interrogates the metadata server only.
put adds a file into the system (publish).
The local DS to which the CLI is connected cuts the file into chunks and adds them to the local datastore. Then it notifies the MDS about this file and its chunks. The operation may have to be rolled back on the DS side if the MDS answers that such a file already exists (filenames are unique).
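The put flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: the class names, the `register` method, the in-memory stores and the tiny `CHUNK_SIZE` are all hypothetical.

```python
CHUNK_SIZE = 4  # bytes; hypothetical and deliberately tiny for the example


class DuplicateFile(Exception):
    """Raised when a filename is already published (filenames are unique)."""


class MetadataServer:
    def __init__(self):
        self.files = {}  # filename -> (number of chunks, name of publishing DS)

    def register(self, filename, nb_chunks, ds_name):
        if filename in self.files:
            raise DuplicateFile(filename)
        self.files[filename] = (nb_chunks, ds_name)


class DataServer:
    def __init__(self, name, mds):
        self.name = name
        self.mds = mds
        self.store = {}  # filename -> list of chunks (bytes)

    def put(self, filename, data):
        if filename in self.store:
            raise DuplicateFile(filename)
        # 1. cut the file into chunks and add them to the local datastore
        chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
        self.store[filename] = chunks
        try:
            # 2. notify the MDS about the file and its chunks
            self.mds.register(filename, len(chunks), self.name)
        except DuplicateFile:
            # 3. the MDS already knows this filename: roll back the local add
            del self.store[filename]
            raise
        return len(chunks)
```

With this sketch, a second DS publishing an already known filename gets a `DuplicateFile` error and its local store is left unchanged.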
get = extract . fetch
fetch = get file into the local DS (or fail) but don't extract it
extract = extract a file from the local DS (defaults to a soft link, with an option to copy instead)
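The composition get = extract . fetch can be illustrated as follows. This is a simplified sketch: data stores are modelled as plain directories holding whole files (no chunks, no MDS lookup), and the function signatures are assumptions made for the example.

```python
import os
import shutil


def fetch(filename, local_store, remote_stores):
    """Bring filename into the local store (or fail); do not extract it."""
    local_path = os.path.join(local_store, filename)
    if os.path.exists(local_path):
        return local_path  # already in the local DS
    for remote in remote_stores:
        remote_path = os.path.join(remote, filename)
        if os.path.exists(remote_path):
            shutil.copyfile(remote_path, local_path)
            return local_path
    raise FileNotFoundError(filename)


def extract(stored_path, destination, copy=False):
    """Expose a file of the local store: soft link by default, copy on request."""
    if copy:
        shutil.copyfile(stored_path, destination)
    else:
        os.symlink(os.path.abspath(stored_path), destination)
    return destination


def get(filename, destination, local_store, remote_stores, copy=False):
    """get = extract . fetch"""
    return extract(fetch(filename, local_store, remote_stores), destination, copy)
```

Keeping fetch and extract separate is what makes the "get with no extract at the end" use case possible: a caller simply stops after `fetch`.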
get retrieves a file. The file is made of several chunks that are distributed between DSs. The metadata server knows where these chunks are.
As a consequence, the client must first ask the metadata server where the chunks are, then download the chunks and assemble them.
FBR: there is a use case which is inefficient: when all DSs ask for the same file. Maybe we should query the MDS before each chunk, or at least every few chunks. Maybe the MDS needs to be able to compute deltas of states very fast. As soon as a file has one more chunk available somewhere, the state of this file (maybe an int) is increased. A DS should be able to ask for the diff (if any) between the last state it knows about for a given file and the current one the MDS knows.
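One possible way to get cheap state deltas, sketched under assumptions: the MDS keeps, per file, an append-only log of "chunk X is now available on DS Y" events, and the state of the file is simply the length of that log. A diff is then a slice of the log. The class and method names below are hypothetical.

```python
class MetadataServer:
    def __init__(self):
        # filename -> append-only list of (chunk index, DS name) events
        self.logs = {}

    def add_location(self, filename, chunk_index, ds_name):
        """Record that one more chunk copy is available; return the new state."""
        log = self.logs.setdefault(filename, [])
        log.append((chunk_index, ds_name))
        return len(log)  # the state is an int that only increases

    def diff(self, filename, known_state):
        """Return (current state, events the caller has not seen yet)."""
        log = self.logs.get(filename, [])
        if not 0 <= known_state <= len(log):
            raise ValueError("unknown state %r for %s" % (known_state, filename))
        return len(log), log[known_state:]
```

A DS that remembers the last state it saw would call `diff` every few chunks and only receive the new locations; an empty list means nothing changed.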
send a file to several hosts
similar to several DSs asking for the same file at the same time, but probably easier to implement efficiently since very efficient algorithms can be used
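The notes do not say which algorithm is meant. As an illustration only, here is one well-known option, a doubling (binomial tree) schedule: in each round every host that already holds the file sends it to one host that does not, so the number of holders doubles per round. The function name and the whole-file granularity are assumptions.

```python
def doubling_schedule(hosts):
    """Return a list of rounds; each round is a list of (sender, receiver).

    hosts[0] is the only host holding the file at the start.
    """
    have = [hosts[0]]
    waiting = list(hosts[1:])
    rounds = []
    while waiting:
        transfers = []
        # only hosts that held the file before this round may send in it
        for sender in list(have):
            if not waiting:
                break
            receiver = waiting.pop(0)
            transfers.append((sender, receiver))
            have.append(receiver)
        rounds.append(transfers)
    return rounds
```

With n hosts this needs about log2(n) rounds instead of n - 1 sequential sends from a single source.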