Monday, August 28, 2023

Isilon MultiScan moves data between tiers unexpectedly

After Isilon was implemented 2.5 years ago, we decided to stop running SmartPools jobs and use a combination of IndexUpdate and FilePolicy jobs instead, to save resources on the Isilon.  We have 4 SSD nodes handling front-end traffic, 4 hybrid nodes handling replication plus some test traffic, and 4 SATA nodes handling backups.  Initially, the SATA nodes were deployed with insufficient memory, which caused a couple of outages during NDMP backups.  After we maxed out the memory on the SATA nodes, there were no more issues.

Because we want to keep as much data as possible on SSD, we don't run FilePolicy until SSD used space is over 70%.  About 2 years ago, after a firmware update (which reboots the nodes), MultiScan kicked in automatically and we saw data move between tiers.  We checked with support and they said it was OK to kill it, so we just killed the MultiScan job.  Recently, we finally had our first disk failure on the production Isilon, and MultiScan kicked in automatically again.  Same as before, we saw data moving unexpectedly to the SATA tier.  I suspected it was related to FilePolicy, but support told me it should not be.

After the case had been open with support for a long time, we finally got a suggestion from a higher level of support to run the SmartPools job periodically.  So I experimented on the DR Isilon, since it is not busy.  I used IndexUpdate and FilePolicy jobs to move the data and compared the tier usage multiple times, until each tier reached the utilization I wanted: 60-70% for SSD and below 60% for the SAS pool (the spillover pool).  Then I kicked off a MultiScan job.  Now I don't see any more data moving between tiers when MultiScan runs.
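The utilization targets above can be written down as a small check, which is handy when comparing tier usage after each FilePolicy run.  This is only a sketch: the function and constant names are made up for illustration, the percentages are the ones mentioned in this post, and how you actually collect the per-tier usage numbers (CLI, web UI, API) is left out on purpose.

```python
# Thresholds from the post; all names here are hypothetical.
SSD_FILEPOLICY_TRIGGER_PCT = 70.0     # run FilePolicy only above this
SSD_TARGET_RANGE_PCT = (60.0, 70.0)   # where SSD usage should end up
SAS_MAX_PCT = 60.0                    # spillover pool should stay below this


def should_run_filepolicy(ssd_used_pct: float) -> bool:
    """True once SSD used space is over the 70% trigger."""
    return ssd_used_pct > SSD_FILEPOLICY_TRIGGER_PCT


def tiers_on_target(ssd_used_pct: float, sas_used_pct: float) -> bool:
    """True when SSD is within 60-70% and the SAS pool is below 60%."""
    low, high = SSD_TARGET_RANGE_PCT
    return low <= ssd_used_pct <= high and sas_used_pct < SAS_MAX_PCT
```

For example, `tiers_on_target(65.0, 55.0)` is true, while an SSD tier at 75% or a SAS pool at 60% means the policy still needs adjusting.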







In the future, I will adjust the FilePolicy each month and then kick off the SmartPools job once a month, just in case MultiScan is started by a failed HDD or a node replacement.




===========================================================================

Update Sep 25, 2023

For the production Isilon, after IndexUpdate completed, I ran FilePolicy and then the MultiScan job.  I still saw data move to the SATA pool.  So I ran IndexUpdate -> FilePolicy -> SmartPools.  Same thing.  It looks like there is a discrepancy between IndexUpdate + FilePolicy and the SmartPools job.

This time, I just ran the SmartPools job and killed it once the SATA pool reached 85%.  Then I adjusted the FilePolicy and reran the SmartPools job.  After 3 tries, I finally saw the results I wanted.  Why there is a discrepancy between FilePolicy and the SmartPools job, I have no idea.
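The "kill it once the SATA pool reaches 85%" step was done by hand, but the stop condition is simple enough to sketch.  Everything below is an assumption for illustration: `watch_smartpools_run` is a made-up name, the usage samples would come from whatever you use to poll pool utilization, and `cancel_job` stands in for however you cancel the running job on your cluster.

```python
from typing import Callable, Iterable

SATA_CANCEL_PCT = 85.0  # limit used in this post


def watch_smartpools_run(
    sata_usage_samples: Iterable[float],
    cancel_job: Callable[[], None],
    limit_pct: float = SATA_CANCEL_PCT,
) -> bool:
    """Cancel the job as soon as a SATA usage sample reaches the limit.

    Returns True if the job was cancelled, or False if the samples ran
    out first (meaning the job finished on its own).
    """
    for used_pct in sata_usage_samples:
        if used_pct >= limit_pct:
            cancel_job()  # hypothetical hook: kill the running job
            return True
    return False
```

If this returns True, the FilePolicy still sends too much data to SATA and needs another adjustment before the next run.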

From now on, I will adjust the FilePolicy once every 2 months and then run the SmartPools job.  For the DR Isilon, since it has far less data, I will adjust the FilePolicy once every 3-4 months and then run the SmartPools job.
