RAID Interview Questions & Answers

1-What is redundancy?

Redundancy means writing the same data to two or more disks at the same time. Having the same data stored on separate disks enables the data to be recovered in the event of a disk failure without resorting to expensive data recovery techniques.

2-What is disk mirroring?

In a simple computer, data is stored on a single hard disk. Disk mirroring writes the same data to two separate disks at once. If one disk fails, all the information is still available on the other disk.

3-What is a disk block?

A block is a set of bits or bytes that forms an identifiable unit of data. The term is used in database management, word processing, and network communication. On a disk, a block is the unit in which data is read and written.

4-What is disk striping?

In computers that use multiple hard disks, disk striping is the process of dividing a body of data into blocks and spreading those blocks across several partitions on several hard disks. The space used on each partition is limited to the size of the smallest partition. For example, if three partitions of 150 MB, 100 MB, and 50 MB are selected, each one contributes only 50 MB to the stripe set. It is wise to create partitions of equal size to avoid wasting disk space. Disk striping is used with redundant array of independent disks (RAID), a storage system that uses multiple disks to store and distribute data. Some implementations allow up to 32 hard disks in a stripe set.
There are two types of disk striping: single-user and multi-user. Single-user disk striping allows multiple hard disks to simultaneously service multiple I/O requests from a single workstation. Multi-user disk striping allows multiple I/O requests from several workstations to be sent to multiple hard disks. This means that while one hard disk is servicing a request from a workstation, another hard disk is handling a separate request from a different workstation.
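
The block distribution described above can be sketched in a few lines of Python. This is only an illustration, assuming simple round-robin placement; the function names stripe and unstripe and the tiny 2-byte chunk size are made up for the example and do not come from any RAID tool.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin to the disks."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        disks[chunk_index % num_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading the chunks back in round-robin order."""
    out = []
    longest = max(len(d) for d in disks)
    for row in range(longest):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


# Five 2-byte chunks spread over three hypothetical disks.
layout = stripe(b"ABCDEFGHIJ", num_disks=3, chunk_size=2)
print(layout)
print(unstripe(layout))
```

Consecutive chunks land on different disks, which is why a large sequential read or write can keep all disks busy at once.
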

5-What is parity in RAID?

Parity computations are used in RAID drive arrays for fault tolerance: a value is calculated from the data on two drives and the result is stored on a third. The parity is computed by XOR’ing a bit from drive 1 with a bit from drive 2 and storing the result on drive 3. If a drive fails and is replaced, the RAID controller rebuilds the lost data from the other two drives. RAID systems often have a “hot” spare drive ready and waiting to replace a drive that fails.
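
A minimal Python sketch of this XOR scheme, assuming three drives and equal-length blocks. The drive contents and the helper name xor_blocks are invented for the illustration.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


drive1 = b"\x0f\xa5hello"
drive2 = b"\xf0\x5aworld"

# The parity block is what would be stored on drive 3.
parity = xor_blocks(drive1, drive2)

# If drive 1 is lost, XOR of the two surviving drives gives its data back.
rebuilt_drive1 = xor_blocks(drive2, parity)
print(rebuilt_drive1 == drive1)
```

The same operation recovers drive 2 from drive 1 and the parity, because XOR is its own inverse.
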

6-What is fault tolerance?

Fault tolerance is the ability of a system to continue working even when a fault exists. In the case of RAID, which originally stood for Redundant Array of Inexpensive Disks, fault tolerance is provided by having data recorded on more than one drive, and also by having more than one power supply. Note that RAID 0 is not fault tolerant because it simply stripes the data to increase size and bandwidth, but provides no redundancy. RAID 1 and RAID 5 are fault tolerant, to various levels.

7-What is a disk array?

A disk array is a hardware element that contains a large group of hard disk drives (HDDs). It may contain several disk drive trays and has an architecture which improves speed and increases data protection. The system is run via a storage controller, which coordinates activity within the unit. Disk arrays form the backbone of modern storage networking environments. A storage area network (SAN) contains one or more disk arrays that function as the repository for the data which is moved in and out of the SAN.

8-What is a SAN?

A storage area network, or SAN, is a high-speed network of storage devices that also connects those storage devices with servers. It provides block-level storage that can be accessed by the applications running on any networked servers. SAN storage devices can include tape libraries, and, more commonly, disk-based devices, like RAID hardware.

9-What is RAID?

Originally, the term RAID stood for “redundant array of inexpensive disks,” but now it usually refers to a “redundant array of independent disks.” While older storage devices used only one disk drive to store data, RAID storage uses multiple disks in order to provide fault tolerance, to improve overall performance, and to increase storage capacity in a system.
With RAID technology, data can be mirrored on one or more other disks in the same array, so that if one disk fails, the data is preserved. Thanks to a technique known as “striping,” RAID also offers the option of reading or writing to more than one disk at the same time in order to improve performance. In this arrangement, sequential data is broken into segments which are sent to the various disks in the array, speeding up throughput. Also, because a RAID array uses multiple disks that appear to be a single device, it can often provide more storage capacity than a single disk.

10-What is the difference between hardware RAID and Software RAID?

Hardware-based RAID is independent of the host. A hardware RAID device connects to the SCSI controller and presents the RAID arrays as a single SCSI drive. An external RAID system moves all RAID handling “intelligence” into a controller located in the external disk subsystem. The whole subsystem is connected to the host via a normal SCSI controller and appears to the host as a single disk.
Software RAID is implemented at the OS kernel level. The Linux kernel contains an MD driver that allows the RAID solution to be completely hardware independent. The performance of a software-based array depends on the server CPU performance and load.


11-What are the commonly used RAID types?
RAID 0
RAID 1
RAID 5

12-Explain RAID 0?
RAID level 0 works on the “striping” technique. In RAID 0 the data is broken into stripes and written across all member disks. RAID 0 allows high I/O performance but provides no redundancy. The size of a RAID 0 array is equal to the sum of the capacities of the disks in the array (assuming equal-sized members). If one drive fails, all data in the array is lost.

13-Explain RAID 1?
RAID level 1 is based on the mirroring technique. Level 1 provides redundancy by writing identical data to each member disk of the array. The storage capacity of a level 1 array is equal to the capacity of one of the mirrored hard disks in a hardware RAID or one of the mirrored partitions in a software RAID. Because every member holds a full copy, RAID 1 gives good protection against disk failure. In RAID 1, writes are relatively slow because every write goes to all members, but read speed is good because reads can be served by any member.

14-Explain RAID 5?

RAID level 5 is based on striping with rotating parity. RAID 5 does not store a second copy of the data; it stores parity information, which can be used to reconstruct the data if a disk fails. With N member partitions of equal size, the storage capacity of a software RAID level 5 array is the combined capacity of the members minus the size of one partition, that is, (N - 1) times the partition size. RAID 5 performance depends on the parity calculation, but with modern CPUs that is usually not a big problem. Read speed is good; writes are somewhat slower than on RAID 0 because parity must be updated on every write.
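
The capacity rules given for RAID 0, RAID 1, and RAID 5 can be checked with a small Python helper. This is only a sketch that assumes all members have the same size; the function name usable_capacity is made up for the illustration.

```python
def usable_capacity(level: int, num_disks: int, disk_size_gb: float) -> float:
    """Usable capacity of an array of equal-sized members, per the rules above."""
    if level == 0:
        # Striping only: every member contributes its full size.
        return num_disks * disk_size_gb
    if level == 1:
        # Mirroring: every member holds the same data.
        return disk_size_gb
    if level == 5:
        if num_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One member's worth of space is used for parity.
        return (num_disks - 1) * disk_size_gb
    raise ValueError(f"RAID level {level} is not covered by this sketch")


for level, disks in [(0, 4), (1, 2), (5, 4)]:
    print(f"RAID {level} with {disks} x 500 GB: {usable_capacity(level, disks, 500)} GB")
```
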


15-Which kernel module is required for Software RAID?
“md” module

16-Which utility or command is used for creating software RAID in RHEL 5?

mdadm


17-Can we create software RAID during Linux installation?

Yes, we can create software RAID during Linux installation using “Disk Druid”, the partitioning tool in the RHEL 5 installer.

18-What is the role of chunk size for software RAID?
Chunk size is a very important parameter that affects RAID performance.
Stripes go across disk drives, but how big is the piece of a stripe on each disk? The pieces a stripe is broken into are called chunks. To get good performance you must choose a reasonable chunk size.
For large I/Os, small chunks work well because each request is spread across all the disks; for small I/Os, large chunks work well because each request can be served by a single disk, leaving the other disks free for other requests.
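
The effect of chunk size can be seen by computing which disks a single request touches. The Python sketch below assumes a plain RAID 0 layout with chunks laid out round-robin; the helper names locate and disks_touched and the sizes used are chosen only for the example.

```python
KIB = 1024


def locate(offset: int, chunk_size: int, num_disks: int) -> tuple[int, int]:
    """Map a logical byte offset to (disk number, byte offset on that disk)."""
    chunk_index = offset // chunk_size
    disk = chunk_index % num_disks
    stripe_row = chunk_index // num_disks
    return disk, stripe_row * chunk_size + offset % chunk_size


def disks_touched(offset: int, length: int, chunk_size: int, num_disks: int) -> set[int]:
    """Return the set of disks that a request of `length` bytes has to access."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    return {c % num_disks for c in range(first_chunk, last_chunk + 1)}


# A 256 KiB request with 64 KiB chunks is spread over all 4 disks.
print(disks_touched(0, 256 * KIB, chunk_size=64 * KIB, num_disks=4))
# A 4 KiB request stays on one disk when the chunk is large ...
print(disks_touched(14 * KIB, 4 * KIB, chunk_size=64 * KIB, num_disks=4))
# ... but can straddle two disks when the chunk is small.
print(disks_touched(14 * KIB, 4 * KIB, chunk_size=16 * KIB, num_disks=4))
```
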

19-What is SWAP Space?
Swap space in Linux is used when physical memory (RAM) is full. If the system needs more memory resources and the RAM is full, inactive pages in memory are moved to the swap space. While swap space can help machines with a small amount of RAM, it should not be considered a replacement for more RAM. Swap space is located on hard drives, which have a slower access time than physical memory.

20-What are the steps to create a swap file or partition?

– Create the swap partition or file
– Write the swap signature using “mkswap”
– Add the swap entry to the /etc/fstab file
– Activate the swap space with the “swapon -a” command, which enables every swap entry listed in /etc/fstab

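21-How will you create a swap file of size 4 GB, and what does the swap file entry in /etc/fstab mean?

Use the “dd” command to create the swap file, then restrict its permissions, format it, and activate it.
dd if=/dev/zero  of=/SWAPFILE  bs=1M  count=4096
chmod 600 /SWAPFILE
mkswap /SWAPFILE
swapon /SWAPFILE
Entry in the /etc/fstab file, so that the swap file is activated at boot (or by “swapon -a”):
/SWAPFILE   swap   swap   defaults   0   0
The fields are: the swap file path, the mount point (“swap”, because swap space is not mounted in the directory tree), the filesystem type (“swap”), the mount options, the dump flag (0, not dumped), and the fsck pass number (0, not checked).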
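
Because dd writes bs × count bytes, the count needed for a given file size is easy to get wrong. The following Python sketch does the arithmetic for a 4 GiB file; the helper name dd_count is made up for the illustration.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def dd_count(target_bytes: int, block_size: int) -> int:
    """Number of blocks dd must write so that bs * count equals the target size."""
    if target_bytes % block_size:
        raise ValueError("target size is not a multiple of the block size")
    return target_bytes // block_size


# A 4 GiB swap file needs this many blocks at two common block sizes.
print(dd_count(4 * GIB, KIB))   # with bs=1024
print(dd_count(4 * GIB, MIB))   # with bs=1M
```

A larger block size needs a much smaller count and usually lets dd finish faster, since fewer write calls are issued.
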

22-What are the advantages and disadvantages of RAID?

Advantages:
  • RAID keeps redundant information (a mirror copy or parity) in the storage array, so data survives a drive failure
  • If one of the drives fails, it can be swapped for a new drive without turning the system off (hot swapping), or a redundant spare drive can take over
  • Improves data reliability and input/output performance, and provides shadowing/mirroring at a lower cost
  • Parity-based levels allow regular parity checks that can detect problems before they cause a crash
  • Provides disk striping, which improves performance by interleaving bytes or groups of bytes across disks
  • Disk striping makes multiple smaller disks look like one large disk
  • Reads and writes can be carried out on several disks simultaneously
  • Mirroring gives 100% duplication of data on two drives
  • With mirroring, the data on one disk can be compared against the copy on the other disk after a crash

Disadvantages:
  • RAID doesn’t make data recovery any easier
  • RAID cannot completely protect your data
  • RAID doesn’t always result in improved system performance
  • It can be costly: RAID controllers and dedicated hard drives must be purchased and maintained
  • It may require skilled (and expensive) administrators to maintain
  • It may slow down system performance if not used properly
  • RAID is not a substitute for backups; it addresses availability and access speed, not accidental deletion or corruption
  • If your data is not being backed up offsite, it is still at risk
  • There is more work to do, such as installing drivers, updating firmware, and running consistency checks

23-What's the difference between RAID 0 & RAID 1?

Striping
  RAID 0: Yes; data is striped (or split) evenly across all disks in the RAID 0 setup.
  RAID 1: No; data is fully stored on each disk.

Mirroring, redundancy and fault tolerance
  RAID 0: No
  RAID 1: Yes

Performance
  RAID 0: In theory RAID 0 offers faster read and write speeds compared with RAID 1.
  RAID 1: RAID 1 offers slower write speeds but could offer the same read performance as RAID 0 if the RAID controller uses multiplexing to read data from disks.

Applications
  RAID 0: Where data reliability is less of a concern and speed is important.
  RAID 1: Where data loss is unacceptable, e.g. data archival.

24-What is the minimum number of disk drives needed for R0, R1, R5, R10, R01?

R0: Minimum 2
R1: Minimum 2
R5: Minimum 3
R10: Minimum 4
R01: Minimum 4

25-What's the difference between RAID 3 & RAID 5?

RAID 3 and RAID 4: Striped set (3 disk minimum) with dedicated parity. RAID 3 stripes data at the byte level and RAID 4 at the block level; in both, all parity is stored on a single dedicated disk. They provide improved performance and fault tolerance similar to RAID 5, but with a dedicated parity disk rather than rotated parity stripes. The single parity disk is a bottleneck for writing, since every write requires updating the parity data. One minor benefit is that if the dedicated parity disk fails, operation continues without parity protection but with no performance penalty.

RAID 5 does not have a dedicated parity drive; the parity is rotated across all the drives, so it is distributed.
RAID 5: Striped set (3 disk minimum) with distributed parity. Distributed parity requires all but one drive to be present to operate; a failed drive must be replaced, but the array is not destroyed by a single drive failure. After a drive failure, any subsequent reads can be calculated from the distributed parity, so the failure is masked from the end user. The array will lose data if a second drive fails, and it is vulnerable until the data that was on the failed drive has been rebuilt onto a replacement drive.

26-What's the difference between RAID 01 & RAID 10?

RAID 0+1: Striped set + mirrored set (4 disk minimum; even number of disks) provides fault tolerance and improved performance but increases complexity. The array continues to operate with one failed drive. The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. As a result it is only guaranteed to survive a single disk loss (a further failure is survivable only if it hits the striped set that has already failed), whereas RAID 1+0 can sustain multiple drive losses as long as no mirrored pair loses both of its drives.

RAID 1+0: Mirrored set + striped set (4 disk minimum; even number of disks) provides fault tolerance and improved performance but increases complexity. The array continues to operate with one or more failed drives. The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives.
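
The difference in failure tolerance can be checked by enumerating every two-disk failure in a four-disk array. The Python sketch below is only a model: it assumes disks 0 and 1 form the first pair (or striped set) and disks 2 and 3 the second, and the function names are made up for the illustration.

```python
from itertools import combinations


def raid10_survives(failed: set[int]) -> bool:
    """RAID 1+0: a stripe over mirror pairs {0,1} and {2,3}.
    The array survives unless some mirror pair loses both disks."""
    mirror_pairs = [{0, 1}, {2, 3}]
    return all(not pair <= failed for pair in mirror_pairs)


def raid01_survives(failed: set[int]) -> bool:
    """RAID 0+1: striped set {0,1} mirrored by striped set {2,3}.
    The array survives only if at least one striped set is fully intact."""
    stripe_sets = [{0, 1}, {2, 3}]
    return any(not (s & failed) for s in stripe_sets)


two_disk_failures = [set(c) for c in combinations(range(4), 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)
print(f"RAID 1+0 survives {raid10_ok} of {len(two_disk_failures)} two-disk failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(two_disk_failures)} two-disk failures")
```

Under this model RAID 1+0 survives more of the double-failure cases than RAID 0+1, which matches the comparison above.
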
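
27-How does RAID 5 work and how is parity calculated?

RAID 5 stripes data blocks across the member disks and stores one parity block per stripe, rotating the disk that holds the parity from stripe to stripe. The parity calculation is typically performed using a logical operation called “exclusive OR” or “XOR”. As you may know, the “OR” logical operator is “true” (1) if either of its operands is true, and false (0) if neither is true. The exclusive OR operator is “true” if and only if exactly one of its operands is true; it differs from “OR” in that if both operands are true, “XOR” is false. Because XOR’ing the parity with the surviving data blocks yields the missing block, the contents of a failed disk can be recomputed.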
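
A small Python sketch of a RAID 5 stripe built with XOR parity. The rotation rule used here (parity starts on the last disk and moves one disk to the left with each stripe) is an assumption made for the example; real implementations offer several layouts. The names xor_all and raid5_stripe are invented for the illustration.

```python
from functools import reduce


def xor_all(blocks: list[bytes]) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_stripe(data_blocks: list[bytes], stripe_no: int, num_disks: int):
    """Lay out one stripe: num_disks - 1 data blocks plus one parity block.
    Assumed rotation: parity on the last disk for stripe 0, then moving left."""
    parity_disk = (num_disks - 1 - stripe_no) % num_disks
    row = list(data_blocks)
    row.insert(parity_disk, xor_all(data_blocks))
    return parity_disk, row


parity_disk0, row0 = raid5_stripe([b"AA", b"BB"], stripe_no=0, num_disks=3)
parity_disk1, row1 = raid5_stripe([b"CC", b"DD"], stripe_no=1, num_disks=3)
print(parity_disk0, row0)
print(parity_disk1, row1)

# Simulate losing disk 0 in stripe 1: XOR of the surviving blocks restores it.
survivors = [block for disk, block in enumerate(row1) if disk != 0]
print(xor_all(survivors))
```
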


28-Other than RAID itself, what other features does RAID management software provide?

Hot spare
RAID level migration (RLM)
SNMP interaction/management

29-What is initialization?

Initialization is the process of preparing a drive for storage use. It erases all data on the drive and makes way for new file system creation.

30-What is a consistency check?

A consistency check (CC) verifies the correctness of the data in logical drives, for example by confirming that mirror copies or parity agree with the stored data. This is a feature of some hardware RAID controller cards.

31-What is background initialization?

This is a consistency check process that is forced when a new logical drive is created. It is an automatic operation; on some controllers it starts 5 minutes after the new logical drive is created.

32-What is a RAID array?

A RAID array is a group of disks that are configured with RAID. For every level except RAID 0, that means they are in a redundant setup that can tolerate disk failures.

33-What's the difference between a JBOD & a RAID array?

Just a Bunch Of Disks (JBOD) – hard disks that aren’t configured in a RAID configuration. They are just disks piled or connected in one single enclosure.

A RAID array has the advantage that it can withstand a disk failure and still keep the data available.


34-When is JBOD preferred over a RAID array?

JBOD is preferred over RAID when there is no need for redundancy and an occasional hard disk failure or period of data unavailability is acceptable, because JBOD is an inexpensive storage solution. It is also easier to set up and start using than RAID.

35-What is a hot spare?

A hot spare is an extra, unused disk drive that is part of the disk subsystem. It is usually in standby mode, ready for service if a drive fails. Whenever there is a drive failure, the hot spare kicks in and takes over the failed drive’s role.

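36-What is a logical drive or virtual drive?

A logical drive is a division of physical storage into units that the operating system sees as separate drives. A single, large physical drive can be partitioned into two or more smaller logical drives. On a RAID controller, a logical (or virtual) drive is the volume the controller presents to the operating system, built from all or part of one or more physical drives.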

37-What is rebuilding of an array?

Whenever there is a disk failure in a RAID array, the array goes into the degraded (downgraded) state. When we pull out the failed drive and insert a new, functioning drive, the array starts regenerating the lost data onto the new drive. This process is called rebuilding.

38-What do you do when a drive in an array fails, and how do you bring the array back to optimal online mode?

We swap out the failed drive, plug in a new functioning drive, and wait for the rebuilding process to complete, making sure the rebuild finishes without any errors. Once it completes, the array is back in the optimal online state.

39-What are the different states an array can be in?

Online
Degraded (also called downgraded)
Offline
Rebuilding

40-Explain the Online, Offline, Degraded and Rebuilding states of an array.

Online – All drives are working fine.
Degraded – There has been a drive failure, but the array is still functioning.
Offline – The array, and with it the whole data store, is down.
Rebuilding – Storage is accessible, but because a new drive has been inserted in place of a failed drive, data is being written to the new drive, which might slow down the performance of the whole RAID array.
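
As a rough model, the state of an array can be derived from how many drives have failed and whether a replacement is being rebuilt. The Python sketch below is only an illustration; the enum, the function name, and the tolerated_failures parameter are hypothetical and do not mirror any controller's API.

```python
from enum import Enum


class ArrayState(Enum):
    ONLINE = "online"
    DEGRADED = "degraded"      # some tools call this "downgraded"
    REBUILDING = "rebuilding"
    OFFLINE = "offline"


def array_state(failed_drives: int, tolerated_failures: int,
                rebuilding: bool = False) -> ArrayState:
    """Classify an array; tolerated_failures is 1 for RAID 1 or RAID 5, 0 for RAID 0."""
    if failed_drives == 0:
        return ArrayState.ONLINE
    if failed_drives > tolerated_failures:
        return ArrayState.OFFLINE
    return ArrayState.REBUILDING if rebuilding else ArrayState.DEGRADED


print(array_state(0, tolerated_failures=1))
print(array_state(1, tolerated_failures=1))
print(array_state(1, tolerated_failures=1, rebuilding=True))
print(array_state(2, tolerated_failures=1))
```
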

41-What is the difference between a global hot spare & a dedicated hot spare?

A global hot spare is available to any array in the whole enclosure or storage subsystem. A dedicated hot spare is assigned to one specific array.

For example, suppose an enclosure has 10 drives: 3 drives in a RAID 5 (1st array), 3 more drives in a second RAID 5 (2nd array), and 2 more drives in a RAID 1 (3rd array). In the RAID configuration utility we can assign one of the remaining drives as a dedicated hot spare for the 1st RAID 5 array. If a drive fails in the 2nd or 3rd array, this dedicated hot spare is not used. But if a drive fails in the array it is dedicated to, the dedicated hot spare takes over.
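
The selection rule can be sketched in Python. This is a hypothetical model, not any vendor's logic: the Spare class, the pick_spare function, the array and disk names, and the choice to try a dedicated spare before a global one are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Spare:
    name: str
    dedicated_to: Optional[str] = None  # None means the spare is global


def pick_spare(failed_array: str, spares: list[Spare]) -> Optional[Spare]:
    """Pick a spare for the array that lost a drive.
    Assumed policy: use a spare dedicated to that array first, then any global spare."""
    for spare in spares:
        if spare.dedicated_to == failed_array:
            return spare
    for spare in spares:
        if spare.dedicated_to is None:
            return spare
    return None


spares = [Spare("disk9", dedicated_to="array1"), Spare("disk10")]
print(pick_spare("array1", spares))  # the dedicated spare for array1
print(pick_spare("array2", spares))  # falls back to the global spare
```

With only the dedicated spare present, a failure in the 2nd or 3rd array gets no spare at all, which is the behaviour described in the example above.
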

42-How is RAID configured through the BIOS?

If we have a hardware RAID controller card, it offers an option during boot to enter the RAID BIOS utility. This utility lets us create RAID arrays through a semi-GUI (DOS-style, text-based) interface.

43-How is RAID configured at the OS level?

Once we install the device drivers and the RAID configuration or management utility, we can use that utility to configure RAID at the OS level.

44-What is the difference between software RAID & hardware RAID?

In order for RAID to function, there needs to be software, either in the operating system or in dedicated hardware, to properly handle the flow of data from the computer system to the drive array. This is particularly important when it comes to RAID 5 due to the large amount of computing required to generate the parity calculations.

In the case of software implementations, CPU cycles are taken away from the general computing environment to perform the necessary tasks for the RAID interface. Software implementations are very low cost monetarily because all that is necessary to implement one is the hard drives. The problem with software RAID implementations is the performance drop of the system. In general, this performance hit can be 5% or more, depending upon the processor, memory, drives used and the level of RAID implemented. As the cost of hardware RAID controllers fell over the years, many users moved away from software RAID, although software RAID (such as the Linux MD driver) is still in use.

Hardware RAID has the advantage of dedicated circuitry to handle all the RAID drive array calculations outside of the processor. This provides excellent performance for the storage array. The drawback of hardware RAID has been the cost. In the case of RAID 0/1 controllers, those costs have become so low that many chipset and motherboard manufacturers include these capabilities on the motherboard. The real costs rest with RAID 5 hardware, which requires more circuitry for added computing ability.

45-Which is the best RAID level for performance and which is best for redundancy?

RAID 0 for performance
RAID 5 or RAID 6 for redundancy (availability)
