My VM, SAN and NetWorker blog: 2022

Friday, August 26, 2022

How to find HBA WWPN in Windows 2008, 2012 and 2016

I checked a lot of sites and saw that others suggest using fcinfo. For Windows 2008, you can use Storage Explorer instead. It shows not only the WWPN of the server but also the zoneset if the server is connected to a Cisco FC switch.

For Windows 2012 and 2016, I have seen other options such as PowerShell scripts. However, there is a built-in cmdlet that is very convenient. Just run Get-InitiatorPort in PowerShell and it will return the WWPN and WWNN.
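When I paste a WWPN into a zoning config, I usually need it in colon-separated form. Below is a minimal Python sketch for that conversion. It assumes the address is printed as 16 hex digits with no separators; the function name format_wwn and the sample value are made up for illustration and do not come from a real HBA.

```python
import re


def format_wwn(raw: str) -> str:
    """Return a WWN as lowercase, colon-separated byte pairs."""
    # Drop any separators that may already be present (colons, dashes, spaces).
    digits = re.sub(r"[^0-9A-Fa-f]", "", raw)
    if len(digits) != 16:
        raise ValueError(f"expected 16 hex digits, got {len(digits)}: {raw!r}")
    digits = digits.lower()
    # Split the 16 digits into 8 two-digit groups and join them with colons.
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))


if __name__ == "__main__":
    # Made-up example value, not from a real HBA.
    print(format_wwn("10000000C9ABCDEF"))
```

The function also accepts a value that already has colons, so it can be used to normalize a mixed list of WWPNs.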

=========================================================================

Updated in August 2022 to include Windows 2016.

Saturday, August 20, 2022

Raspberry Pi Desktop for Lenovo T400

Besides running Puppy Linux, I decided to try Raspberry Pi Desktop on my Lenovo T400 laptop. Now I can use my old laptop for browsing and checking email. Speed is acceptable. It is hard to believe my T400 is still working fine.

All the instructions can be found at https://projects.raspberrypi.org/en/projects/install-raspberry-pi-desktop

You use the same update commands to update the OS periodically.

sudo apt update

sudo apt full-upgrade

I installed 2 additional tools to check the HDD status.

sudo apt-get install smartmontools

sudo apt-get install gsmartcontrol
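Once smartmontools is installed, the quick health check is smartctl -H on the disk. Here is a small Python sketch that wraps it. It assumes an ATA/SATA disk, where smartctl prints a line containing "SMART overall-health self-assessment test result: PASSED" (SCSI/SAS disks use different wording). The device path /dev/sda and the function names are my own placeholders, and smartctl normally needs root.

```python
import subprocess

# Line smartctl -H prints for a healthy ATA/SATA disk (assumption: not a SCSI/SAS disk).
PASS_MARKER = "SMART overall-health self-assessment test result: PASSED"


def health_passed(smartctl_output: str) -> bool:
    """Return True if the smartctl -H output reports a PASSED health test."""
    return PASS_MARKER in smartctl_output


def check_disk(device: str = "/dev/sda") -> bool:
    """Run smartctl -H on the device (hypothetical default path) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag problems
    )
    return health_passed(result.stdout)
```

Calling check_disk() from a script run with sudo returns True only when the PASSED line is present, so a failed disk, a missing device, or an unsupported disk type all come back as False.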

All the hardware is discovered and functioning correctly, including the old built-in Intel 5300 wireless card. Surprisingly, my old Targus USB docking station ACP50US works with Raspberry Pi Desktop, even though there is no driver for it on Windows 7. At least the ethernet port and the sound card in the ACP50US are working fine.

Monday, July 25, 2022

Isilon replication performance issue II

We had no issues with replication on those 2 big shares for about half a year. Replication is set up to run every 4 hours, and each run normally completes within 20 minutes. Then one day, I received an email alert from the target Isilon.

"PDM degraded, too many operations alert"

After that, the incremental replication took longer and longer to complete for the share with lots of files and a deep folder structure. Other shares were not affected. I opened a support ticket, and support confirmed that we were affected by bug 132337 and provided a workaround. After applying the workaround, replication was back to normal.

The final fix should be in OneFS 9.2.1.13 / 9.1.0.20.

Sunday, July 24, 2022

Isilon replication performance issue I

We have a couple of shares that require replication to the target Isilon. Each consists of hundreds of millions of files, and one of them is set up with a deep folder structure. We only set limits on resource usage for replication, but this was not good enough. The CPU limit is set to 25%. We confirmed that replication can consume only 10% of the trunk. Workers are set to a maximum of 33% at all times, as recommended by the vendor during initial setup.

First, we made a mistake at the beginning and did not read through every word in the replication document.

If the source share contains lots of files and/or a deep directory structure, the Domain Mark job can consume a lot of resources on the first sync. It is recommended to set up replication first and run the initial sync with the “Prepare Policy for Accelerated Failback Performance” box checked before migrating data to the source share.

However, we did the complete opposite. We migrated the share from the Veritas cluster to the source Isilon share, then set up replication and enabled sync for the first time. Once the data copy completed and the Domain Mark job started, it began to take resources away. The application using the share is very sensitive to performance and complains when latency exceeds 25 ms. At times, share latency jumped to 40 ms, and we received complaints from the application owner.

The workaround suggested by support is to set vfs.vnlru_reuse_freevnodes to 1. The command can be run on any node. We saw a significant decrease in resource usage afterwards. (Please check with support to confirm the cause before running any command.)

# isi_sysctl_cluster vfs.vnlru_reuse_freevnodes=1

Once the first sync and the first Domain Mark job completed successfully, we had no further issues.

SNMP for Isilon

I tried to set up SNMP for Isilon but did not find much documentation. Hardware issues are monitored by the vendor through eSRS, and we are also alerted through SMTP.

We tried to use SNMP to monitor share usage for NFS. We ran a few tests to confirm the message Tivoli sees when usage exceeds the advisory and soft limits, and the message it sees when usage continues to violate the limit.

The main issue we discovered is that Tivoli did not receive the alerts all the time. Later on, we found out that the cluster can send SNMP from any of its nodes. In our Isilon setup, we have at least 5 zones on different subnets. Because of firewall rules, not all of the subnets can reach the SNMP manager. We have 3 types of nodes: Flash nodes (serving FE traffic), Hybrid nodes (serving some less critical applications and replication) and Archive nodes (backup traffic only). We decided to use the Archive nodes for monitoring. We confirmed that SNMP can reach the manager from the backup subnet, then modified the Isilon Alert section and created an SNMP monitoring channel.

Modify the channel so that only the Archive nodes can send out SNMP alerts. In my setup, these are nodes 1 - 4. Now SNMP alerts are only sent through the Archive nodes, which have no issue reaching the SNMP manager.

CertUtil to verify MD5 and SHA256 checksum

When we download files from a vendor, we usually use a 3rd party tool to verify the checksum. With Windows 10, you can use CertUtil.

1) Open Command Prompt in Windows Desktop

2) Then, enter command CertUtil -hashfile "path to the file" hash-function-type

Use MD5 or SHA256 as the hash function type. There is no need to download a 3rd party tool to verify the checksum.
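For machines without CertUtil, the same check can be done with Python's standard hashlib module. This is only a rough equivalent and not CertUtil itself. The function names file_hash and matches_published are my own, and I am assuming the vendor publishes the checksum as a hex string.

```python
import hashlib


def file_hash(path: str, algorithm: str = "sha256", chunk_size: int = 1024 * 1024) -> str:
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published: str, algorithm: str = "sha256") -> bool:
    """Compare a file's digest with the vendor's value, ignoring case and spaces."""
    expected = published.strip().lower().replace(" ", "")
    return file_hash(path, algorithm) == expected
```

For example, matches_published("image.bin", "<checksum from the vendor page>", "md5") returns True only when the download is intact. The file name here is a placeholder.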

Thursday, April 21, 2022

Update firmware for MDS 9700 switches

You can find the steps on the Cisco site to perform a non-disruptive upgrade on MDS 9000 switches. I added some additional steps.

1) Read the release notes and discuss with support to determine which version to upgrade to. Then, download the correct firmware file. The firmware files for Supervisor 3 and Supervisor 4 in the MDS 9710 are different.

2) Confirm running config is applied to startup config

copy running-config startup-config

3) Copy running config to bootflash

copy running-config bootflash:$(SWITCHNAME)-$(TIMESTAMP).cfg

4) Back up the config, existing kickstart and firmware to TFTP. Other than the startup config, I make sure I have a copy of the current version of the kickstart and firmware on the TFTP server.

copy bootflash: tftp:

5) Save a copy of show tech-support detail. In case something goes wrong, support can compare the switch condition before and after the upgrade.

6) Confirm bootflash free space before uploading the new firmware. I delete firmware and kickstart files older than the current version.

dir bootflash:

dir bootflash://sup-standby/

7) Upload the image and kickstart files using TFTP

copy tftp: bootflash:

8) Verify the MD5 checksum

show file bootflash:filename md5sum

9) Run the command below to see the impact of the upgrade.

show install all impact kickstart bootflash:kickstart.bin system bootflash:image.bin

Check the output to see if the upgrade is non-disruptive.

Run the commands below for a basic health check

show system redundancy status

show module

10) Confirm no TFTP / SFTP session (CSCvu52058)

In order to see if there are open file transfer sessions, you can run

show users | inc ssh | wc l

and

show processes | inc dcos_ssh | wc l

to compare the number of ssh users logged in and the number of sshd processes running.

If the values are different, you may have an open file transfer connection.

11) Run the command below to see if the existing config is incompatible with the image.

show incompatibility system bootflash:image.bin

12) Run the command below to start the upgrade

install all kickstart bootflash:kickstart.bin system bootflash:image.bin

Once again, you see the impact of the upgrade.

It lists the current and target versions of the modules.

Then, type y and press Enter to proceed with the upgrade.

Wait for the switchover message.

To show the installation progress, run the command below.

show install all status

Once it shows "Install has been successful", run the commands below to confirm the new version of firmware is applied to the switch as expected.

show version

show module
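The comparison in step 10 can also be done offline if you save the output of show users and show processes to text. Below is a minimal Python sketch of that idea. The function names are my own, and it simply counts matching lines the same way the include filter does, so treat a mismatch as a hint to investigate rather than proof of an open transfer.

```python
def count_matching_lines(output: str, needle: str) -> int:
    """Count the lines of saved command output that contain the given text."""
    return sum(1 for line in output.splitlines() if needle in line)


def possible_open_transfer(show_users: str, show_processes: str) -> bool:
    """Return True when ssh logins and sshd processes differ in number."""
    ssh_users = count_matching_lines(show_users, "ssh")
    sshd_processes = count_matching_lines(show_processes, "dcos_ssh")
    # A different count suggests an sshd process without an interactive login,
    # for example a file transfer that is still open.
    return ssh_users != sshd_processes
```

Pass in the saved text of both commands. If the function returns True, check for stale file transfer sessions before starting the install.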

Saturday, March 26, 2022

Isilon NDMP throttle

There are 3 shares on the Isilon with very complex folder structures. They are deep and contain lots of small files. Again, you need to follow the tech spec guide: you cannot have more than 1 million files per folder. We have an NFS share with over 100,000,000 files, but we create a new folder under the share each year.

The only issue is backup. When an NDMP backup runs, it uses a lot of CPU when it reaches certain folders. CPU usage can reach 80%. We saw the same thing when we used Datadobi to migrate to Isilon: when it reached those folders, scan speed dropped by 50%.

I have Gen 6 nodes with OneFS 9.1. According to the doc linked below, the NDMP throttler requires Gen 6 nodes and the command is available in OneFS 8.2. It looks like this setting can be modified through the CLI only. At least with 9.1, I don't see it in the GUI.
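To spot folders that are getting close to the per-folder file limit, a small Python sketch like the one below can be run against a mounted share. The function name and the mount path in the usage note are placeholders of mine. It counts only the files directly inside each folder, and a full walk of a share with hundreds of millions of files will take a long time, so I would point it at one subtree at a time.

```python
import os


def oversized_directories(root: str, limit: int = 1_000_000):
    """Yield (path, file_count) for each folder holding more than `limit` files.

    Only files directly inside a folder are counted, not those in subfolders.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > limit:
            yield dirpath, len(filenames)
```

For example, list(oversized_directories("/mnt/share/2022", limit=900_000)) returns the folders in that subtree that already hold more than 900,000 files.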

Isilon OneFS Help

1. Run the following command through the command line interface to enable NDMP Throttler:

isi ndmp settings global modify --enable-throttler true

2. View the setting by running the following command:

isi ndmp settings global view

3. The default CPU throttle threshold is 50%. I use the command below to change it to 65.

isi ndmp settings global modify --throttler-cpu-threshold 65

Isilon spare settings

Open the SmartPools settings.

Enable the Global Spillover pool if you have more than 1 pool.

Make sure Virtual hot spare is selected. Our environment is small and uses 1 virtual drive. For a bigger environment, use a % of total storage instead. The EMC tech suggests 10% of total storage.
