Burhan Halilov

Question: Is this "Recommended Multi-Path Settings" KB article really a good practice for VMware multipathing?

Discussion created by Burhan Halilov on May 7, 2017
Latest reply on Oct 19, 2017 by Erwin van Londen

This KB article

https://knowledge.hds.com/Knowledge/Storage/Reccomended_Multi-Path_settings_for_HDS_Storage

states:

1. Each HBA should only "see" one instance of each LUN. This means if you have 2 HBAs, then you should only have 2 paths to each LUN. If you have 4 HBAs, then you can provide 4 paths to each LUN. Allowing an HBA to see more than one instance of any LUN can elongate path recovery, cause performance issues, cause delayed error recovery, and even cause host outages. HDS does not recommend having an HBA see more than one instance of a LUN.

 

The configuration:
number of paths <= number of HBAs
...optimizes performance, reliability, and recovery.

2. Two to four paths to each LUN provide optimal performance for most workloads. While there is no limitation on the number of paths one can configure to each LUN, two to four has been demonstrated to provide the best performance. Performance-wise, configuring more than four paths to each LUN will result in significantly diminishing returns.

3. "Single Target / Single Initiator Zoning" is recommended for Brocade and Cisco-based fabrics.
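
For what it's worth, rule 1 above can be checked mechanically from a path listing. A minimal sketch in Python, assuming each path is available as an (HBA, target port, LUN) tuple; the HBA names, port names and LDEV IDs below are made up for illustration, and no vendor API or tool is used:

```python
from collections import defaultdict

def paths_per_hba(paths):
    """Count distinct target ports per (hba, lun) pair.

    Each path is a (hba, target_port, lun) tuple.
    """
    ports = defaultdict(set)
    for hba, target_port, lun in paths:
        ports[(hba, lun)].add(target_port)
    return {key: len(seen) for key, seen in ports.items()}

def violations(paths):
    """Return the (hba, lun) pairs where one HBA sees a LUN more than once."""
    return sorted(key for key, n in paths_per_hba(paths).items() if n > 1)

# Hypothetical data: two HBAs, one LDEV.
compliant = [
    ("hba0", "CL1-A", "ldev_0010"),
    ("hba1", "CL2-A", "ldev_0010"),
]
# Same host, but each HBA is also zoned to a second target port.
noncompliant = compliant + [
    ("hba0", "CL3-A", "ldev_0010"),
    ("hba1", "CL4-A", "ldev_0010"),
]

print(violations(compliant))     # []
print(violations(noncompliant))  # [('hba0', 'ldev_0010'), ('hba1', 'ldev_0010')]
```

The check only encodes "number of paths per HBA per LUN <= 1", which is my reading of the KB's rule; it says nothing about whether breaking the rule actually hurts.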

 

Here are my concerns:

  • If this KB is a best-practice recommendation, none of the following documents mentions it:
    • Host Attachment Guide
    • G1000 Performance guide
    • Provisioning Guide for Open Systems
    • Optimize Hitachi Storage and Server Platforms in VMware vSphere 5.5 Environments Best Practices Guide
    • HDLM User's guide
  • There is no date on it. It could have been a best practice for USPV or earlier, from many years ago.
  • There is no technical explanation of what happens, and why, if it is not followed.
  • HSC does not issue a warning if one HBA can see an LDEV through more than one path, nor does it prevent users from provisioning without following this “recommendation”.
  • HSC Health reports do not detect when the “Best Practice” is not followed, nor do they identify performance issues that may be caused by not following it.
  • Most of the HDS people I spoke to are not aware of this KB, but when a performance case is opened, support uses it to blame the user.
  • This "best practice" becomes a huge burden to follow if any automation is needed when adding hosts or LDEVs. If more than 4 FA ports are needed (8 is not unusual for clusters with 30+ hosts), the scenarios become exponentially more complex.

 

 

 

In the case of a vSphere cluster, and according to the KB, the following configuration is recommended/supported:

 

 

 

But the following is NOT, since it doesn't satisfy the "one LDEV one path from any one HBA" rule (we have 2 paths to each LDEV from every HBA):

 

 

 

 

Dealing with vSphere clusters with 10+ hosts and hundreds of TBs of data, the first configuration does not scale. If we want 4 or more FA ports per cluster, there are two scenarios we can use:

(assuming 2 FAs for simplicity; if we use 4 ports in 4 separate FAs, the limitations would be the same)

  • Segregate servers into groups:

 

  • Segregate LDEVs into groups:

 

 

 

Neither of these scenarios is flexible or well (and self-) balanced. In both scenarios each FA port has only 1 other failover FA port, so we need to keep utilization below 50% on all ports to prevent contention if a port fails. Manual load balancing is needed if any of the ports gets hot. In the first scenario the zoning is inconsistent; in the second the host groups are inconsistent.
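
To put a number on the headroom point: if n ports share load evenly and one fails, the survivors carry n·u/(n-1) each, so the steady-state utilization u has to stay at or below (n-1)/n. A rough sketch in Python, assuming perfectly even load and even redistribution after a failure (real multipathing will not be this tidy); the function name is my own:

```python
def max_safe_utilization(ports_in_failover_group):
    """Highest per-port utilization that still fits on the survivors
    after a single port failure.

    Assumption: load is spread evenly before the failure and is
    redistributed evenly across the remaining ports afterwards.
    """
    n = ports_in_failover_group
    if n < 2:
        raise ValueError("need at least two ports for failover")
    return (n - 1) / n

for n in (2, 4, 8):
    print(f"{n} ports: keep each below {max_safe_utilization(n):.1%}")
# 2 ports: keep each below 50.0%
# 4 ports: keep each below 75.0%
# 8 ports: keep each below 87.5%
```

With failover pairs (n = 2), as in both scenarios above, the ceiling is 50%. If every host could fail over across all 4 or 8 ports, the same arithmetic would allow considerably more usable bandwidth per port.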

 

 

Are you following this "Recommendation"?
