Storage 101: Object Storage and Block Storage



Block storage:

Block storage manages data in blocks that are accessed over the network using a protocol such as iSCSI.
To the end user, a block storage volume looks like a normal local disk attached to the machine.

Data is accessed directly rather than through a filesystem, which results in better performance.
Tasks that require coordination, such as handling locks and shared access, are managed by the client.
A block has an address, but it lacks the metadata found in file storage (type of data, owner, access rights, ...).
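
As a rough illustration of address-based access, the sketch below reads and writes raw blocks by their block number. It is a simplification under stated assumptions: a temporary file stands in for a real block device, and the 512-byte `BLOCK_SIZE` as well as the `write_block`, `read_block`, and `demo` names are chosen only for this example.

```python
import tempfile

BLOCK_SIZE = 512  # assumed block size in bytes, for illustration only


def write_block(dev, block_no: int, data: bytes) -> None:
    """Write one block at the given block address, zero-padded to BLOCK_SIZE."""
    if len(data) > BLOCK_SIZE:
        raise ValueError("data does not fit in one block")
    dev.seek(block_no * BLOCK_SIZE)
    dev.write(data.ljust(BLOCK_SIZE, b"\x00"))


def read_block(dev, block_no: int) -> bytes:
    """Read the raw bytes stored at the given block address."""
    dev.seek(block_no * BLOCK_SIZE)
    return dev.read(BLOCK_SIZE)


def demo() -> bytes:
    # A temporary file plays the role of a block device (hypothetical setup).
    with tempfile.TemporaryFile() as dev:
        dev.truncate(8 * BLOCK_SIZE)  # a tiny "volume" of 8 blocks
        write_block(dev, 3, b"hello")
        return read_block(dev, 3)
```

Note that the block carries nothing but bytes: there is no owner, type, or name attached to block 3, only its address.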

Block storage is not typically used for backup and archiving; it is mostly used by applications that need high read and write performance, such as databases.
Block storage can be expanded by adding more disks or more machines, but it cannot span a large geographical area, since the added latency may harm performance.

Object storage:

Object storage stores data as objects in a flat structure with no hierarchy. These objects are kept in containers called buckets.
Unlike the blocks in block storage, buckets can live anywhere: there are no constraints on distance as there are with block storage.

An object has the following attributes:
  • data: the content of the object
  • unique ID: an identifier that removes the need for paths, as used in file storage.
  • metadata: a description of the content of the object that can be used for indexing, sorting, ...
Multiple copies of a bucket can live on multiple nodes in different locations, reducing the need for a dedicated backup strategy.

We can simply use the metadata to carry replication information, for example by tagging buckets as "replicable to location X" or "in need of replication X times".

The UID is used as a "filename" to search for and locate objects.
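
To make the data / unique ID / metadata model more concrete, here is a minimal in-memory sketch. The `Bucket` and `StoredObject` classes and their methods are hypothetical names invented for this example and do not correspond to any particular product's API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredObject:
    data: bytes                                   # content of the object
    metadata: dict = field(default_factory=dict)  # free-form description


class Bucket:
    """A flat container: objects are looked up by ID, never by path."""

    def __init__(self, name: str):
        self.name = name
        self._objects: Dict[str, StoredObject] = {}

    def put(self, data: bytes, **metadata) -> str:
        """Store an object and return its generated unique ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        """Locate an object directly by its unique ID."""
        return self._objects[object_id]

    def find(self, **criteria) -> List[str]:
        """Return the IDs of objects whose metadata matches all criteria."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]
```

For instance, `bucket.put(b"...", owner="alice")` returns an ID that can later be passed to `bucket.get(...)`, while `bucket.find(owner="alice")` uses the metadata as a simple index.
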
Besides replication, another feature is erasure coding.
Erasure coding means splitting a big object into smaller fragments and computing additional redundant (parity) fragments from them.

With erasure coding, the parts of an object can live on different machines, with the assurance that if one machine fails, the data can be reconstructed from the remaining parts.
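
The idea can be sketched with the simplest possible scheme: k data fragments plus a single XOR parity fragment, which tolerates the loss of any one fragment. This is a deliberately reduced illustration, not what production object stores do (they typically use codes such as Reed-Solomon that survive several losses), and the function names are made up for this example.

```python
def split(data: bytes, k: int) -> list:
    """Split data into k equal-size fragments, zero-padding the last one."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_all(fragments) -> bytes:
    """XOR a sequence of equal-length byte strings together."""
    out = bytearray(len(fragments[0]))
    for frag in fragments:
        for i, byte in enumerate(frag):
            out[i] ^= byte
    return bytes(out)


def encode(data: bytes, k: int) -> list:
    """Return k data fragments followed by one parity fragment."""
    frags = split(data, k)
    return frags + [xor_all(frags)]


def reconstruct(fragments: list) -> list:
    """Rebuild the single missing fragment (marked as None) from the others."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity can only recover one lost fragment")
    repaired = list(fragments)
    if missing:
        present = [frag for frag in fragments if frag is not None]
        repaired[missing[0]] = xor_all(present)
    return repaired
```

Each element of the list returned by `encode` would be placed on a different machine; if one machine fails, its fragment is set to `None` and `reconstruct` recomputes it from the survivors.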

Object storage also has the following protection characteristics:
  • Versioning: older versions of an object are kept when changes are applied to it, so we can restore old data if need be.
  • Concurrent access: a separate version is created for each writing user or application.
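
A minimal sketch of how versioning could behave is shown below: every write appends a new version instead of overwriting, so earlier content stays retrievable. `VersionedBucket` is a hypothetical class written for this example; real systems assign their own version identifiers.

```python
from typing import Optional


class VersionedBucket:
    def __init__(self):
        self._versions = {}  # object ID -> list of versions, oldest first

    def put(self, object_id: str, data: bytes) -> int:
        """Store data as a new version and return its 1-based version number."""
        history = self._versions.setdefault(object_id, [])
        history.append(data)
        return len(history)

    def get(self, object_id: str, version: Optional[int] = None) -> bytes:
        """Return the requested version, or the latest one by default."""
        history = self._versions[object_id]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise KeyError(f"no version {version} for {object_id!r}")
        return history[version - 1]
```

Two writers saving the same object simply produce two versions, and either one can be read back by its version number.
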
Object storage is not ideal for data that is changed often by different users, such as Excel sheets.
It is normally used as backup storage, where there is no urgent need for high performance.

Object storage uses a RESTful API over the HTTP protocol to send commands for retrieving, storing, deleting, ... the data.

An example of such a RESTful API is the Amazon S3 API.
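
The mapping from commands to HTTP verbs can be sketched as follows. The example only builds the requests and does not send them; the endpoint `https://objects.example.com` and the `/{bucket}/{object_id}` URL layout are assumptions made for illustration, and a real service such as S3 additionally requires signed authentication headers.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
ENDPOINT = "https://objects.example.com"


def object_url(bucket: str, object_id: str) -> str:
    """Build the URL that identifies one object inside a bucket."""
    return f"{ENDPOINT}/{quote(bucket)}/{quote(object_id)}"


def store_request(bucket: str, object_id: str, data: bytes) -> Request:
    """Storing an object maps to an HTTP PUT carrying the data as the body."""
    return Request(object_url(bucket, object_id), data=data, method="PUT")


def retrieve_request(bucket: str, object_id: str) -> Request:
    """Retrieving an object maps to an HTTP GET."""
    return Request(object_url(bucket, object_id), method="GET")


def delete_request(bucket: str, object_id: str) -> Request:
    """Deleting an object maps to an HTTP DELETE."""
    return Request(object_url(bucket, object_id), method="DELETE")
```

Passing one of these `Request` objects to `urllib.request.urlopen` would actually send it to the server.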

Object storage is usually accessed through a web GUI.

Access to object storage is also possible using NAS protocols (SMB/CIFS, NFS) through an interface, called a cloud gateway, that translates NAS commands into "object" commands.

Remark:

In general, object storage lacks the file locking feature available in file storage.
