Monday, August 28, 2023

Isilon MultiScan moves data between tiers unexpectedly

After we implemented Isilon 2.5 years ago, we decided to stop running SmartPools jobs and use a combination of the IndexUpdate and FilePolicy jobs instead, to save resources on the Isilon.  We have 4 SSD nodes handling front-end traffic, 4 hybrid nodes handling replication plus some test traffic, and 4 SATA nodes handling backup.  Initially, the SATA nodes were deployed with insufficient memory, which caused a couple of outages during NDMP backups.  After we maxed out the memory on the SATA nodes, there were no more issues.

Because we want to keep as much data as possible on SSD, we don't run FilePolicy until SSD used space is over 70%.  About 2 years ago, after a firmware update (with node reboots), MultiScan kicked in automatically and we saw data move between tiers.  We checked with support, who said it was OK to kill it, so we just killed the MultiScan job.  Recently, we had our first disk failure on the production Isilon, and MultiScan kicked in automatically again.  Same as before, we saw data moving unexpectedly to the SATA tier.  I suspected it was related to FilePolicy, but support told me it should not be.

After a long-running case, we finally got a suggestion from a higher level of support to run the SmartPools job periodically.  So I experimented on the DR Isilon, since it is not busy.  I used the IndexUpdate and FilePolicy jobs to move data and compared the tier usage multiple times, until each tier reached the utilization I wanted: 60-70% for SSD and below 60% for the SAS pool (the spillover pool).  Then I kicked off the MultiScan job.  Now I no longer see data moving between tiers when MultiScan runs.
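
The thresholds above can be written down as a small decision helper.  This is only a sketch of my own rule of thumb, not anything OneFS provides: the function name, the return strings, and the idea of feeding it percentages read off the storage pool usage by hand are all assumptions.  Treating SSD usage below 60% as a reason to review the policy is also my assumption, based on the 60-70% target.

```python
def next_tiering_action(ssd_used_pct, sas_used_pct,
                        ssd_trigger=70.0, ssd_floor=60.0, sas_ceiling=60.0):
    """Suggest the next manual step from tier utilization percentages.

    Hypothetical helper: the thresholds come from the post (run FilePolicy
    once SSD is over 70%, aim for 60-70% on SSD and below 60% on SAS).
    """
    if ssd_used_pct > ssd_trigger:
        # SSD is past the point where we start moving data down.
        return "run-filepolicy"
    if sas_used_pct >= sas_ceiling or ssd_used_pct < ssd_floor:
        # Spillover pool too full, or too much data already left SSD.
        return "review-policy"
    return "ok"
```

For example, `next_tiering_action(75, 40)` suggests running FilePolicy, while `next_tiering_action(65, 50)` reports that both tiers are inside the target range.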







In the future, I will adjust the FilePolicy each month and then kick off the SmartPools job once a month, in case MultiScan is started by a failed HDD or a node replacement.




===========================================================================

Update Sep 25, 2023

On the production Isilon, after IndexUpdate completed, I ran FilePolicy and then the MultiScan job.  I still saw data move to the SATA pool.  So I ran IndexUpdate -> FilePolicy -> SmartPools.  Same thing.  It looks like there is a discrepancy between IndexUpdate + FilePolicy and the SmartPools job.

This time, I just ran the SmartPools job and killed it once the SATA pool reached 85%.  Then I adjusted the FilePolicy and reran the SmartPools job.  After 3 tries, I finally saw the results I wanted.  Why there is a discrepancy between FilePolicy and the SmartPools job, I have no idea.
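
I did the "kill it at 85%" part by hand, but the loop is simple enough to sketch.  The two callables below are placeholders I made up: one would return the current SATA pool usage in percent, the other would cancel the running job.  How you implement them against your cluster is up to you; nothing here is an Isilon API.

```python
import time
from typing import Callable


def cancel_when_pool_full(get_used_pct: Callable[[], float],
                          cancel_job: Callable[[], None],
                          limit_pct: float = 85.0,
                          poll_seconds: float = 60.0,
                          max_polls: int = 1000,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll pool usage and cancel the job once it reaches limit_pct.

    Returns True if the job was cancelled, False if max_polls was
    exhausted without reaching the limit.
    """
    for _ in range(max_polls):
        if get_used_pct() >= limit_pct:
            cancel_job()
            return True
        sleep(poll_seconds)
    return False
```

The `sleep` parameter is only there so the loop can be tried out with fake readings without actually waiting.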

From now on, I will adjust the FilePolicy once every 2 months and then run the SmartPools job.  For the DR Isilon, since it has far less data, I will adjust the FilePolicy once every 3-4 months and then run the SmartPools job.

Wednesday, August 16, 2023

Cable Modem TC4400 overheated?

I switched to a TC4400 cable modem a few months ago because DOCSIS 3.0 will no longer be supported by my cable company.  It is the only modem supported by my cable company that does not use the Puma chipset.  It worked fine until recently, when the internet connection started dropping a few times a day.  When I touched it, the modem was really hot.  The quick fix was to put a little 15mm by 15mm fan on top of the modem to draw the heat away from it.  Now it is much better and I have not experienced any more connection drops.



Friday, August 11, 2023

How long to quick format a lun in Windows

 I always get this question about the SQL or file cluster RDM disks.  Windows admins and DBAs think the format has hung, but it really does take a long time to quick format a LUN.  Last week, a Windows admin complained that it took approximately an hour to quick format a 7 TB LUN on an EMC PMAX for a Windows 2016 / 2019 VM.

I just completed a test this morning.  On a physical Windows 2016 server, it took about 10-15 minutes to quick format a 1 TB LUN on the PMAX.  I also did some research on the internet.  The numbers published by partitionwizard.com suggest it does indeed take a long time to format a big LUN (see below).

  • How Long Does It Take to Format a 1TB Hard Drive: Performing a Quick Format on a 1TB hard drive takes about 20 minutes. If you select the Full Format, it could take you up to 1 hour.
  • How Long Does It Take to Format a 2TB Hard Drive: Again, we perform a Quick Format on a 2TB hard drive, it can be done in about 30 minutes. However, a Full Format can take up to 3 hours. If this hard drive stores a chunk of data, it could take you a half day.
  • How Long Does It Take to Format a 4TB Hard Drive: To a certain degree, 4TB is a large hard disk that will take quite a long time to format. So, you’d better select a Quick Format if you want to save time. This is because fully formatting a 4TB hard drive can take you a whole day and even more
This gives you an idea that even a quick format in Windows still takes some time.

Our Wintel team will test the format time following EMC article 000062689 on the PMAX.
The cause is that the "trim and unmap" feature is on.  So just turn it off temporarily before formatting a new LUN and re-enable it once done.

On the Windows host, disable the SCSI TRIM and Unmap feature for the duration of the format, using the fsutil command from the command line.

1) To verify the current setting, open a Windows CMD window on the host and run:

fsutil behavior query DisableDeleteNotify
DisableDeleteNotify=0 -indicates the 'Trim and Unmap' feature is on (enabled)
DisableDeleteNotify=1 -indicates the 'Trim and Unmap' feature is off (disabled)

2) To disable, issue the command:   

fsutil behavior set DisableDeleteNotify 1

3) Once formatting is complete, re-enable the feature using command:   

fsutil behavior set DisableDeleteNotify 0
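
If you want to check the setting from a script before and after the format, the query output is easy to parse.  Below is a minimal sketch in Python.  The function name is made up, and I am assuming the output contains one or more "DisableDeleteNotify = N" values (some Windows versions print one line per file system, e.g. NTFS and ReFS); verify against what your hosts actually print.

```python
import re

# Matches "DisableDeleteNotify=0" as well as "NTFS DisableDeleteNotify = 1".
_NOTIFY_RE = re.compile(r"DisableDeleteNotify\s*=\s*([01])")


def trim_enabled(query_output: str) -> bool:
    """Return True if TRIM/Unmap is on according to fsutil query output.

    DisableDeleteNotify=0 means the feature is enabled, 1 means disabled.
    If several file systems are listed, TRIM counts as on when any of
    them reports 0.
    """
    values = _NOTIFY_RE.findall(query_output)
    if not values:
        raise ValueError("no DisableDeleteNotify value found in output")
    return any(v == "0" for v in values)
```

On the host, the input would be the text printed by `fsutil behavior query DisableDeleteNotify`, for example captured with `subprocess.run(..., capture_output=True, text=True).stdout`.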

It may impact Linux as well.  See the thread "mkfs is extremely slow".

To run mkfs without trim, use the -K option for XFS and -E nodiscard for ext4.

XFS

mkfs.xfs -K /dev/sdx 

EXT4

mkfs.ext4 -E nodiscard /dev/sdx

Warning: Only use -K or -E nodiscard on new volumes with no existing data.

Using the -K or -E nodiscard options on drives with existing data will cause the space to be wasted until the data is overwritten.
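
For anyone who builds LUNs from a script, here is a small sketch that picks the right no-discard option per file system.  The helper name is hypothetical and it only builds the argument list from the two commands shown above; it does not run mkfs, and /dev/sdx is a placeholder device.

```python
from typing import List

# No-discard variants of the mkfs commands shown above.
_NO_DISCARD = {
    "xfs": ["mkfs.xfs", "-K"],
    "ext4": ["mkfs.ext4", "-E", "nodiscard"],
}


def mkfs_no_discard_argv(fs_type: str, device: str) -> List[str]:
    """Build the mkfs argument list that skips discard (trim) at format time."""
    key = fs_type.lower()
    if key not in _NO_DISCARD:
        raise ValueError("unsupported file system: " + fs_type)
    return _NO_DISCARD[key] + [device]
```

The resulting list can be passed to `subprocess.run` once you are sure the device is a new, empty volume.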