Data Distribution

Data Distribution at TransferChain

Understanding Data Distribution

Using the application, the user selects the file to store. For security, the file is first encrypted on the user's device before any transmission takes place. Various types of encryption schemes can be used. A private/public key encryption scheme has particular advantages. Using such a scheme, the file is first encrypted with the user’s private key, which is only available on their local device.

The encrypted file is stored on the user’s device as a temporary file. In a particular embodiment, encryption uses AES with HMAC-SHA512 and a random 32-byte key generated by a cryptographically secure random number generator. (The key length can be adjusted to suit the performance of devices with different computational power.) The user app then splits the file into multiple parts, or chunks. In an embodiment, each chunk is fixed at 32 KB to simplify encryption and the preservation of file integrity. However, different sizes can be used, and not all chunks need to be the same size. The last chunk can be padded to bring it up to the set chunk size as required.
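The key generation, chunking, and padding steps described above can be sketched with the Python standard library. This is a minimal illustration, not TransferChain's implementation: the function names are hypothetical, zero-byte padding is an assumption (the text does not name a padding scheme), and the AES step itself is omitted because it needs a third-party library. The input is treated as the already-encrypted temporary file.

```python
import hashlib
import hmac
import secrets

CHUNK_SIZE = 32 * 1024  # 32 KB, the chunk size named in the text


def generate_key(length: int = 32) -> bytes:
    """Return a random key from the OS cryptographically secure generator."""
    return secrets.token_bytes(length)


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks, zero-padding the last one.

    Zero padding is an assumption; the original length must be kept
    elsewhere so the padding can be stripped on reconstruction.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if chunks and len(chunks[-1]) < chunk_size:
        chunks[-1] = chunks[-1].ljust(chunk_size, b"\x00")
    return chunks


def chunk_tag(key: bytes, chunk: bytes) -> bytes:
    """Compute an HMAC-SHA512 tag over one chunk for integrity checking."""
    return hmac.new(key, chunk, hashlib.sha512).digest()


if __name__ == "__main__":
    key = generate_key()
    encrypted = secrets.token_bytes(70_000)  # stand-in for the encrypted file
    chunks = split_into_chunks(encrypted)
    tags = [chunk_tag(key, c) for c in chunks]
    print(len(chunks), "chunks of", CHUNK_SIZE, "bytes each")
```

Because the padding bytes are indistinguishable from data, a real system would have to record the original length (or use a self-describing padding scheme) alongside the chunks.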

Conventional Techniques are Vulnerable

Conventional techniques can be used to indicate the total number and order of chunks for use in file reconstruction. The file is then uploaded to the service in chunks via the network. Once every chunk has been uploaded successfully, the upload is complete and the temporary file on the user device can be deleted. No raw data is circulated or stored on the service processor. The service acts as a proxy-like intermediary that only processes encrypted, parsed data, and it cannot identify any of that data because it is encrypted and parsed in random order. Alternatively, the encrypted file could be transferred to the service server and chunked on the server side. Implementing chunking on the client side, however, reduces the processing requirements of the server and therefore back-end operating costs.
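One conventional way to carry the chunk count and order is a small header on each chunk. The sketch below assumes a hypothetical 8-byte header (chunk index, then total chunk count, both big-endian 32-bit integers); the actual wire format is not described in the text. It shows that chunks can arrive in any order and still be reassembled.

```python
import struct

# Hypothetical header layout: chunk index, total number of chunks.
HEADER = struct.Struct(">II")


def frame_chunks(chunks: list) -> list:
    """Prefix every chunk with its index and the total chunk count."""
    total = len(chunks)
    return [HEADER.pack(i, total) + c for i, c in enumerate(chunks)]


def reassemble(frames: list, original_length: int) -> bytes:
    """Rebuild the file from frames received in any order.

    `original_length` is used to strip padding from the last chunk.
    """
    parts = {}
    total = None
    for frame in frames:
        index, count = HEADER.unpack_from(frame)
        if total is None:
            total = count
        elif count != total:
            raise ValueError("frames disagree on total chunk count")
        parts[index] = frame[HEADER.size:]
    if total is None or set(parts) != set(range(total)):
        raise ValueError("missing or unexpected chunks")
    return b"".join(parts[i] for i in range(total))[:original_length]
```

A missing chunk is detected before any data is returned, which matches the requirement that the temporary file is only deleted after every chunk has been uploaded.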

Each chunk of the uploaded file is stored in a storage node drawn from the set of nodes selected for storing that file. Each node may store one or more chunks. Various techniques can be used to distribute the file in parts across the storage nodes, and chunks need not be distributed evenly. In general, distribution depends at least on file size, the number of chunks, and the number of nodes.
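To make the dependence on file size, chunk count, and node count concrete, the sketch below derives the chunk count from the file size and assigns each chunk to a randomly chosen node. Random assignment is only one of the "various techniques" the text allows, and the node names and function names are hypothetical.

```python
import math
import random
import secrets
from typing import Dict, List, Optional


def chunk_count(file_size: int, chunk_size: int = 32 * 1024) -> int:
    """Number of chunks needed for a file of `file_size` bytes."""
    return math.ceil(file_size / chunk_size)


def assign_chunks(
    num_chunks: int,
    nodes: List[str],
    rng: Optional[random.Random] = None,
) -> Dict[str, List[int]]:
    """Map each node to the chunk numbers it will store.

    Every chunk goes to exactly one node; a node may receive several
    chunks or none, so the distribution is not necessarily even.
    """
    rng = rng or secrets.SystemRandom()  # OS-backed randomness by default
    placement = {node: [] for node in nodes}
    for index in range(num_chunks):
        placement[rng.choice(nodes)].append(index)
    return placement


if __name__ == "__main__":
    n = chunk_count(70_000)  # a 70,000-byte file needs 3 chunks of 32 KB
    print(assign_chunks(n, ["node-a", "node-b", "node-c"]))
```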

File chunks are allocated, e.g., in a random order, to storage ‘slots’ in the service, and the slotted data is pushed to the data nodes. The service stores details recording which chunk is in each slot (for example, by chunk number) and which node and node storage address that slot is associated with. A buffer temporarily holds the encrypted chunks and other data required for this process. If a provider returns an error, the service can cancel that operation and try again, perhaps with a different node. Once the encrypted data has been sent to the providers, the buffer is cleared.
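The slot allocation, retry, and buffer cleanup described above might look roughly like the following. Everything named here is an assumption for illustration: `Slot`, `NodeError`, `push_chunks`, and the `upload(node, chunk) -> address` callable stand in for whatever the service and storage providers actually expose.

```python
import random
from dataclasses import dataclass


@dataclass
class Slot:
    chunk_number: int  # which chunk this slot holds
    node: str          # node that accepted the chunk
    address: str       # storage address reported by that node


class NodeError(Exception):
    """Raised by the hypothetical node client when an upload fails."""


def push_chunks(chunks, nodes, upload, rng=None, max_attempts=3):
    """Place chunks in randomly ordered slots and push each to a node.

    Returns the slot table needed later to locate and reorder chunks.
    """
    rng = rng or random.Random()
    buffer = dict(enumerate(chunks))  # temporary store of encrypted chunks
    order = list(buffer)
    rng.shuffle(order)                # random slot order
    slots = []
    for chunk_number in order:
        # Try up to max_attempts distinct nodes for this chunk.
        candidates = rng.sample(nodes, k=min(max_attempts, len(nodes)))
        for node in candidates:
            try:
                address = upload(node, buffer[chunk_number])
            except NodeError:
                continue              # cancel and retry on a different node
            slots.append(Slot(chunk_number, node, address))
            break
        else:
            buffer.clear()
            raise RuntimeError(f"chunk {chunk_number} could not be stored")
    buffer.clear()                    # clear the buffer once data is sent
    return slots
```

The returned slot table is the only place where chunk numbers are tied to nodes and addresses, so in this sketch it is the piece of state that must survive after the buffer is cleared.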

Technical Overview of Data Distribution

