Deployment Overview¶
The HPE GreenLake for File Storage Data Reduction Estimation Probe estimates the data reduction rate achievable on a sample data set. Review the prerequisites to understand the hardware and software requirements needed to run the probe successfully. This article guides you through deploying and executing the probe.
Download¶
Using sftp, download the probe bundle to the Linux client on which you will run the probe.
sftp gl4f_probe@halo.storagelr5.ext.hpe.com:/935553.probe.bundle.tar.gz .
Enter the password when prompted: HPE@cc3$$4SFTP
If the sftp port on your network differs from the default of 22, specify it with the -P flag:
sftp -P 22 gl4f_probe@halo.storagelr5.ext.hpe.com:/935553.probe.bundle.tar.gz .
Expand & Verify Download¶
Now that you've downloaded the probe, you'll need to untar it and then verify the download is correct.
export PROBE_BUILD=935553
tar -xzf ${PROBE_BUILD}.probe.bundle.tar.gz
ls -l
Note: example may not show current build numbers.
[root@iris-centos-workloadclient-22 probe]# ls -l
total 1840344
-rw-r--r--. 1 root root 937920831 Jul 12 12:44 935553.probe.bundle.tar.gz
-rw-r--r--. 1 root root 946565338 Jul 12 12:44 935553.probe.image.gz
-rwxr-xr-x. 1 root root 19579 Jul 12 12:44 probe_launcher.py
Mount Filesystems Selected to Be Probed¶
Validated Filesystems Include, But Are Not Limited To:
- NFS
- Lustre
- GPFS
- S3 with goofys
- CIFS/SMB
For the most accurate results, do not use root-squash.
It's recommended to set read-only access on the mounted filesystem.
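As an illustration of a read-only mount, here is a hedged NFS example (the server name, export path, and mount point below are placeholders, not values from this guide):

```shell
# Placeholder server, export, and mount point: adjust for your environment.
sudo mkdir -p /mnt/example_filer
# Mount read-only (-o ro) so the probe cannot modify the data set.
sudo mount -t nfs -o ro filer.example.com:/export/data /mnt/example_filer
```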
Create Probe Directories¶
Change /mnt/ to the SSD-backed local disk that the probe should use for its hash database and logging directories:
sudo mkdir -p /mnt/probe/db
sudo mkdir -p /mnt/probe/out
sudo chmod -Rf 777 /mnt/probe
Size of the Data Set¶
- The input to the probe is a defined directory (--input-dir)
- The probe automatically queries the input filesystem for space consumed and file count (inodes) and uses those values in its calculations
- Depending on the method of mounting and the underlying storage, this query can often return inaccurate results
- It's highly recommended to define manual estimates for space consumed (--data-size-gb) and file count (--number-of-files)
- These estimates do not have to be exact; round up reasonably
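As a rough way to produce these estimates, standard GNU coreutils can report usage and inode counts (the path below is a placeholder; `du --inodes` requires GNU coreutils 8.22 or later):

```shell
# Placeholder path: replace with the directory you plan to probe.
TARGET=/mnt/filesystem_to_be_probed/sub_directory

# Space consumed, reported in GB (a starting point for --data-size-gb)
df -BG --output=used "$TARGET"

# Approximate file count (a starting point for --number-of-files)
du -s --inodes "$TARGET"
```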
Running The Probe¶
The probe runs as a foreground application, so if your session is closed for any reason, the probe will stop. It's recommended to run the probe in a screen session.
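For example, the probe can be started inside a named screen session so it survives a dropped connection (this assumes the screen utility is installed; tmux works similarly):

```shell
# Start a named session for the probe (requires the screen utility)
screen -S probe

# ...run the export commands and probe_launcher.py inside the session...

# Detach with Ctrl-A d; reattach later with:
screen -r probe
```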
Here is an example command line. Edit the variables for your environment.
NOTE: Use underscores instead of spaces in COMPANY_NAME and WORKLOAD.
export DB_DIR=/mnt/probe/db
export OUTPUT_DIR=/mnt/probe/out
export INPUT_DIR=/mnt/filesystem_to_be_probed/sub_directory
export INPUT_SIZE_GB=10000
export QTY_FILES=1000000
export COMPANY_NAME=Your_Amazing_Company
export WORKLOAD=Describe_Your_Workload
Start the probe (it may take up to five minutes to start displaying output):
sudo python3 ./probe_launcher.py \
--probe-image-path ${PROBE_BUILD}.probe.image.gz \
--input-dir $INPUT_DIR \
--metadata-dir $DB_DIR \
--output-dir $OUTPUT_DIR \
--data-size-gb $INPUT_SIZE_GB \
--number-of-files $QTY_FILES \
--customer-name ${COMPANY_NAME}---${WORKLOAD}
Example One: Small Data Sets
To probe the directory interesting_data, with 15 TB in use and 5,000,000 files, at the company ACME, the command would be:
sudo python3 ./probe_launcher.py \
--probe-image-path ${PROBE_BUILD}.probe.image.gz \
--input-dir /mnt/acme_filer/interesting_data \
--metadata-dir /mnt/data/probe/db \
--output-dir /mnt/data/probe/out \
--data-size-gb 15000 \
--number-of-files 5000000 \
--customer-name ACME---Interesting_Data
Example Two: Larger Data Sets
To probe the directory fascinating_data, with 60 TB in use and 750,000,000 files, at the company FOO, using defined parameters for RAM and SSD-backed local disk, the command would be:
sudo python3 ./probe_launcher.py \
--probe-image-path ${PROBE_BUILD}.probe.image.gz \
--input-dir /mnt/foo_filer/fascinating_data \
--metadata-dir /mnt/data/probe/db \
--output-dir /mnt/data/probe/out \
--data-size-gb 60000 \
--number-of-files 750000000 \
--customer-name FOO---Fascinating_Data
Example Three: Performance Throttling
To probe the directory riveting_data, with 250 TB in use and 1,250,000,000 files, at the company Initech, using defined parameters for RAM and SSD-backed local disk, while lowering the performance impact on the filesystem, the command would be:
sudo python3 ./probe_launcher.py \
--probe-image-path ${PROBE_BUILD}.probe.image.gz \
--input-dir /mnt/initech_filer/riveting_data \
--metadata-dir /mnt/data/probe/db \
--output-dir /mnt/data/probe/out \
--data-size-gb 250000 \
--number-of-files 1250000000 \
--number-of-threads 4 \
--customer-name Initech---Riveting_Data
Note the --number-of-threads flag. By default the probe uses all CPU cores in the system, but this flag can be used to throttle performance and reduce the potential impact on the scanned filesystem.
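When choosing a throttle value, the host's core count is a useful reference point; a small sketch (the half-the-cores value is illustrative only, not a recommendation from this guide):

```shell
# Total CPU cores; by default the probe uses all of them.
nproc

# Illustrative throttle value: roughly half the cores.
# (A 1-core host yields 0 here, so clamp to at least 1 in practice.)
echo $(( $(nproc) / 2 ))
```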
Other Probe Flags¶
While the probe is running and after completion, telemetry logs are automatically uploaded to HPE. To prevent this, add the following flag:
--dont-send-logs \
If you wish to send file names with the default telemetry logs, add the following flag:
--send-logs-with-file-names \
Probing filesystems that contain snapshots can cause recursion issues and inaccurate results, so the probe automatically ignores directories named .snapshot. If your filesystem uses another naming convention, use the --regexp-filter flag. If for some reason you want the probe to read the .snapshot directories, specify false rather than true for --filter-snapshots.
--filter-snapshots \ (this is the default)
Under most circumstances the probe should be run with adaptive chunking. However, you can disable that feature by specifying this flag:
--disable-adaptive-chunking \
Understanding the Results¶
Once started, the probe displays the current projection of potential data reduction. Once complete, the probe displays output that is further described in Understanding Output.
Re-Running The Probe¶
The hash database must be empty before running the probe again:
sudo rm -r /mnt/probe/db/*
Troubleshooting¶
Refer to the Troubleshooting document and contact HPE Support.