Storage 101: Object storage and block storage
Block storage:
The end user accesses block storage over the network, and it appears as a normal local disk on the machine.
Data is accessed directly, not through a filesystem, which results in better performance.
Tasks that require computation, such as handling locks and shared access, are managed by the client.
A block has an address, but it lacks the metadata we find in file storage (for example type of data, owner, access rights).
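To make the "address but no metadata" point concrete, here is a minimal sketch that treats a temporary file as a stand-in for a block device. The block size and the helper names write_block and read_block are assumptions made for illustration, not part of any real storage API.

```python
import tempfile

BLOCK_SIZE = 512  # bytes per block; an assumed value for illustration


def write_block(device, block_number, payload):
    """Write one block at the given block address (hypothetical helper)."""
    if len(payload) > BLOCK_SIZE:
        raise ValueError("payload larger than one block")
    # The address is turned into a byte offset; nothing else is recorded.
    device.seek(block_number * BLOCK_SIZE)
    # Pad with zero bytes so every block has the same size.
    device.write(payload.ljust(BLOCK_SIZE, b"\x00"))


def read_block(device, block_number):
    """Read one block back; the result is raw bytes with no owner or type."""
    device.seek(block_number * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)


# A temporary file stands in for the disk in this sketch.
with tempfile.TemporaryFile() as device:
    write_block(device, 3, b"row-42")
    raw = read_block(device, 3)
    print(raw.rstrip(b"\x00"))
```

Note that the caller has to remember what block 3 contains and who may change it; that bookkeeping is exactly what the client (or a filesystem layered on top) is responsible for.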
It is not used for backup and archiving; it is used mostly for applications that need high read and write performance, such as databases.
We can expand block storage by adding more disks or more machines, but these can't span a large geographical area, since that may harm performance.
Object Storage:
Unlike blocks in block storage, buckets can live anywhere; there is no constraint on distance as there is with block storage.
An object has the following attributes:
- data: the content of the object.
- unique ID: an identifier that removes the need for paths, as is the case in file storage.
- metadata: a description of the content of the object that can be used for indexing, sorting, ...
We can simply use the metadata to carry replication information, for example to tag the buckets as "replicable to location X" or "in need of replication X times".
The UID is used like a "filename" to search for and locate objects.
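The three attributes and the ID-based lookup can be sketched with a small in-memory store. This is only an illustration: the class names StoredObject and Bucket, the find method, and the metadata keys are all hypothetical, and a real object store would persist and distribute the data.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes                                   # content of the object
    metadata: dict = field(default_factory=dict)  # free-form description


class Bucket:
    """Flat namespace: objects are located by ID, not by a directory path."""

    def __init__(self):
        self._objects = {}

    def put(self, data, metadata=None):
        object_id = str(uuid.uuid4())  # the unique ID acts as the "filename"
        self._objects[object_id] = StoredObject(data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return the IDs of objects whose metadata matches every criterion."""
        return [
            oid
            for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


bucket = Bucket()
# "replicas" and "replicate_to" are made-up metadata keys for this sketch.
photo_id = bucket.put(b"\x89PNG...", {"type": "image", "replicas": 3})
log_id = bucket.put(b"started", {"type": "log", "replicate_to": "X"})

print(bucket.get(photo_id).metadata)
print(bucket.find(replicas=3) == [photo_id])
```

A replication service could then scan with find(replicas=3) or find(replicate_to="X") to decide which objects need copying, without knowing anything about where they physically live.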
Besides replication, another feature is erasure coding.
Erasure coding means splitting a big object into smaller fragments and storing extra redundancy (parity) fragments alongside them.
With erasure coding, the parts of an object can live on different machines with the assurance that if one machine fails, its data can be reconstructed from the remaining parts.
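The idea can be illustrated with the simplest possible scheme: k data fragments plus a single XOR parity fragment, which survives the loss of any one fragment. This is a deliberately simplified sketch; production systems typically use stronger codes (such as Reed-Solomon) that tolerate several failures, and the function names here are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\x00")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    return fragments + [reduce(xor_bytes, fragments)]


def recover(fragments):
    """Rebuild the one missing fragment (marked None) from the survivors."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity tolerates only one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        # XOR of all surviving fragments equals the lost one.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(fragments, original_length):
    """Reassemble the object, dropping the parity fragment and the padding."""
    return b"".join(recover(fragments)[:-1])[:original_length]


payload = b"object storage"
pieces = encode(payload, k=3)   # imagine each piece on a different machine
pieces[1] = None                # one machine fails
print(decode(pieces, len(payload)) == payload)
```

Compared with keeping full replicas, this stores only one extra fragment for three data fragments, at the cost of extra computation when a fragment has to be rebuilt.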
Object storage also has the following protection characteristics:
- Versioning: older versions of an object are kept when changes are applied to it, so we can restore our old data if need be.
- Concurrent access: a separate version is created for each "writing" user or application.
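A rough sketch of these two characteristics, assuming a store that never overwrites in place and instead appends a new version on every write. The class VersionedBucket and its 1-based version numbers are assumptions for illustration only.

```python
class VersionedBucket:
    """Every write appends a new version; nothing is overwritten in place."""

    def __init__(self):
        self._versions = {}  # object ID -> list of payloads, oldest first

    def put(self, object_id, data):
        history = self._versions.setdefault(object_id, [])
        history.append(data)
        return len(history)  # version number, starting at 1

    def get(self, object_id, version=None):
        history = self._versions[object_id]
        if version is None:
            return history[-1]  # latest version by default
        if not 1 <= version <= len(history):
            raise KeyError(f"no version {version} for {object_id}")
        return history[version - 1]


bucket = VersionedBucket()
v_alice = bucket.put("report", b"draft by alice")
v_bob = bucket.put("report", b"draft by bob")  # a second writer, same object

# Both writes survive as separate versions; the older one can be restored.
print(bucket.get("report", version=v_alice))
print(bucket.get("report"))
```

Because each writer produces its own version rather than taking a lock, no write is lost, but someone still has to decide which version is the "right" one.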
Object storage is not ideal for data that is changed often by different users, such as Excel spreadsheets.
It is normally used as backup storage, where there is no urgent need for high performance.
Object storage uses a RESTful API over the HTTP protocol to send commands for retrieving, storing, and deleting the data, among others.
An example of such a RESTful API is the Amazon S3 API.
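As a rough illustration of how storing, retrieving, and deleting map onto HTTP verbs, the sketch below only builds the requests and does not send them. The endpoint storage.example.com, the bucket/key URL layout, and the helper names are hypothetical; a real service such as S3 additionally requires authentication (signed requests), which is omitted here.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://storage.example.com"  # hypothetical endpoint


def object_url(bucket, key):
    """Build the URL of an object; the layout is an assumption of this sketch."""
    return f"{BASE_URL}/{quote(bucket)}/{quote(key)}"


def put_object(bucket, key, data):
    """Storing an object: HTTP PUT with the content as the request body."""
    return Request(object_url(bucket, key), data=data, method="PUT")


def get_object(bucket, key):
    """Retrieving an object: HTTP GET on the same URL."""
    return Request(object_url(bucket, key), method="GET")


def delete_object(bucket, key):
    """Deleting an object: HTTP DELETE on the same URL."""
    return Request(object_url(bucket, key), method="DELETE")


for request in (
    put_object("backups", "db/2024-01.dump", b"..."),
    get_object("backups", "db/2024-01.dump"),
    delete_object("backups", "db/2024-01.dump"),
):
    print(request.get_method(), request.full_url)
```

Passing one of these objects to urllib.request.urlopen would actually perform the call; in practice most people use the vendor's SDK, which wraps the same HTTP operations.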
Object storage is usually accessed through a web GUI.
Access to object storage is also possible using NAS protocols (SMB/CIFS, NFS) through an interface, a cloud gateway, that translates NAS commands into "object" commands.
Remark:
In general, object storage lacks the file-locking feature available in file storage.