Saturday, April 19, 2025

DCNM login error "RequestSendFailed: EJBCLIENT000409" for the SAN Client

We have a plan to build a new NDFC deployment to manage our MDS switches.  It is more complex and will take some time to deploy; a few challenges are resource requirements, license transfer, and other new requirements.  In the meantime, we continue to use the existing DCNM 11.5(4).

Last week, a login error suddenly appeared in the Java SAN client: "RequestSendFailed: EJBCLIENT000409".  We restarted the services and rebooted the DCNM server, but it did not help.

After some research, it was most likely related to the certificate.  We do not have our own certificate, so most likely the default one expired.  That was confirmed by checking the certificate expiration date from the web browser.  I opened a ticket with support and asked them to renew the certificate for two more years.  After that, I could log in without issue.
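
Before opening the ticket, a quick way to confirm the expiration is to pull the certificate dates with openssl from any host that can reach the DCNM server.  This is just a generic check, not a DCNM-specific command; dcnm.example.com is a placeholder for your DCNM server name and 443 is the default HTTPS port.

# Print the notBefore/notAfter dates of the certificate presented on the DCNM web port
echo | openssl s_client -connect dcnm.example.com:443 -servername dcnm.example.com 2>/dev/null | openssl x509 -noout -dates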

Thursday, April 17, 2025

Migration from VMAX 40k to PMAX 8000 and 8500

We just completed the storage migration from VMAX to PMAX 8000 and 8500 before Christmas.  NDM was not an option because NDM no longer supports Solaris.  Besides, there are a number of busy databases running in both test and production, so the SRDF directors between the VMAX and PMAX would become the bottleneck before cutover to the PMAX.  Also, we do not have any spare directors left to configure for SRDF traffic.

Our plan is to use Storage vMotion for all VMDKs.  The RDMs are for SQL databases running in ESX.  For the RDMs, we use SRDF/S to start replication to the PMAX in the background.  Because SRDF is not supported from VMAX to PMAX 8500, all LUN migrations using SRDF are replicated to the PMAX 8000.  Then the app owner picks a downtime to shut down the app and the servers.  We stop the sync after confirming there are no outstanding tracks, remove the VMAX LUNs from the storage group, and then add the PMAX LUNs.

For Solaris, if it is an Oracle DB, LUNs of the same size or larger are added to Oracle, then the DBAs complete the rebalancing and drop the old VMAX LUNs.  For the boot LUNs and LUNs from other apps, some of our Unix admins use SRDF/S to migrate them to the PMAX 8000, while others decide to do host-based migration.  For host-based migration, we just provide a LUN of the same size or larger from the PMAX 8500 to the Unix admin.

Below are the general steps for the storage migration from VMAX to PMAX 8000 using SRDF/S.  Please check your environment and test to see if additional steps are required.  If a volume manager is used, the migration should be completed with the volume manager rather than SRDF.  A rough symcli sketch of these steps follows the note after the list.

1) Change the source LUN attribute in VMAX to dyn_rdf.
2) Set up the SRDF/S pairs from VMAX to PowerMax 8000 (put the target LUNs in a temporary target_SG).
3) During the downtime, shut down the apps and servers.
4) Confirm there are no outstanding tracks.
5) Perform the SRDF split.
6) Remove the source LUNs from the VMAX storage group.
7) Add the target LUNs to the SG in PowerMax and remove them from the temporary target_SG.
8) Host team completes the LUN mapping.
9) Delete the SRDF pairs with the force option.
10) Unset the GCM bit if required (symdev -sid xxx -devs xxx unset -gcm).
11) Host team can perform a rescan if step 10 is required (they should see about 1 MB more space for the LUNs in step 10).
12) Power up the servers to validate.

That way, they can always go back to the VMAX LUNs if a backout is required.

Note: in the past, if the source/target LUNs were not mapped to FE ports, we sometimes saw strange results on some of the SRDF operations.  So, I create a temporary target_SG whose masking view has an IG with no HBAs in it.
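
Roughly, the symcli commands behind steps 1, 2, 4, 5, 6, 7 and 9 look like the sketch below.  This is only an outline from memory, not a verified runbook: the SIDs (0001234 for the VMAX, 0005678 for the PMAX 8000), the device ranges, RDF group 10, the pair file name and the SG names are all placeholders, so check the exact syntax against the Solutions Enabler SRDF CLI guide for your version before using it.

# Step 1: allow dynamic RDF on the source devices (VMAX)
symconfigure -sid 0001234 -cmd "set dev 00AB:00CF attribute=dyn_rdf;" commit

# Step 2: create and establish the SRDF/S pairs from a pair file (one "sourceDev targetDev" per line)
symrdf createpair -sid 0001234 -rdfg 10 -f rdf_pairs.txt -type R1 -establish

# Step 4: confirm the pairs are Synchronized with no invalid tracks before the split
symrdf -sid 0001234 -rdfg 10 -f rdf_pairs.txt query

# Step 5: split the pairs during the downtime
symrdf -sid 0001234 -rdfg 10 -f rdf_pairs.txt split

# Steps 6 and 7: move the LUNs between storage groups (the PMAX device IDs differ from the VMAX ones)
symaccess -sid 0001234 -name src_SG -type storage remove devs 00AB:00CF
symaccess -sid 0005678 -name prod_SG -type storage add devs 01AB:01CF
symaccess -sid 0005678 -name target_SG -type storage remove devs 01AB:01CF

# Step 9: delete the SRDF pairs with the force option
symrdf -sid 0001234 -rdfg 10 -f rdf_pairs.txt deletepair -force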



Isilon NDMP backup failed after NetWorker upgraded to 19.11

We need to update OneFS on the existing Isilon to 9.7.1.x in order to add new Isilon nodes, as recommended by support.  Before we do that, we need to confirm NDMP backup compatibility with Isilon.  Because we back up the Isilon over NDMP with NetWorker, a NetWorker upgrade from 19.10.0.4 to 19.11 is required.  After the NetWorker upgrade in February, some of the NDMP backups failed with the error message "Hostname resolution failed".  Multiple retries would eventually work, but it is very annoying.

After working with support, it turns out there is a new change in NetWorker 19.11.  See the KB: NetWorker: server upgraded to 19.11, backup fails reporting "Hostname resolution failed" | Dell US.

None of the workarounds in the KB above fixed the issue on Isilon.  The forward DNS lookups for Isilon are actually delegated to the SmartConnect SIP by the DNS server.  The only option is to add reverse entries to the DNS server for all the Isilon nodes that handle NDMP backup.  Refer to SmartConnect and Reverse DNS | Dell PowerScale: Network Design Considerations | Dell Technologies Info Hub.

So, PTR records for all Isilon nodes handling NDMP backup were created on the DNS server for the NDMP zone name.  Once that was done, all backups were fine.  Keep in mind there is no change to the forward lookup, which is still handled by the SmartConnect SIP.  Not sure if there is a fix for this now.
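
A quick way to confirm the new entries is to test the lookups from the NetWorker server.  The name and IPs below are placeholders for the SmartConnect zone used for NDMP and the individual node IPs; only the reverse lookups for the node IPs need the new PTR records.

# Forward lookup, still answered via the SmartConnect SIP
nslookup isilon-ndmp.example.com
# Reverse lookups for the node IPs, which should now return the NDMP zone name
dig -x 10.10.10.11 +short
dig -x 10.10.10.12 +short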

Tuesday, April 15, 2025

Update Isilon to 9.7.1.4 from 9.4.0.14 to add new A300, H700 and F710 nodes

The main reason for the upgrade in January is to replace the existing A200, H500 and F800 nodes with new A300, H700 and F710 nodes.  Support recommends updating OneFS on the existing nodes to 9.7.1.4 before adding the new Isilon nodes to the pool.

Things went smoothly during the upgrade, and so far there is no issue.  Right after the upgrade, I needed to reconfigure the following settings again (a quick CLI sanity check follows the list).

1) The SNMP node restriction got reset; manually select nodes 1-4 again.  (In our environment, we only allow SNMP traps from the A200 nodes.  Those A200 nodes handle only NDMP backup, on the only VLAN configured for them.)

2) SNMP v3 is a new feature, and the alert channel for SNMP was set to auto.  Manually select SNMP v2 for our environment.

3) The SMTP setting is reset back to manual settings (nothing is required since it is populated with the same SMTP info).
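
For a quick sanity check after the upgrade, the settings above can also be reviewed from the OneFS CLI with something like the commands below.  These are from memory of OneFS 9.x, so verify them against the CLI guide for your release before relying on them.

isi snmp settings view        (SNMP service, community and v3 settings)
isi event channels list       (alert channels; confirm the SNMP channel is back on v2)
isi email settings view       (SMTP relay and sender settings)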

New features

1) The IPv6 feature is new and is enabled by default.

2) There is a new Transfer Limit feature, set at 90%, for the spillover pool.  I guess spillover will not fill a pool up once it is 90% full while another pool is below 90%.

3) SupportAssist is required for SCG (Secure Connect Gateway) in future OneFS releases.

4) Firewall