Commit aed5848

Oliver Cervera authored
Merge pull request #3 from auanasgheps/dev
Merge code for 2.8 release
2 parents 77f7be8 + 1b99f87 commit aed5848

2 files changed

+87
-62
lines changed

README.md

Lines changed: 35 additions & 19 deletions
@@ -16,22 +16,41 @@ _This readme has some rough edges which will be smoothened over time._
1616
# Highlights
1717

1818
## How it works
19-
- After some preliminary checks, the script will execute `snapraid diff` to figure out if parity info is out of date, which means checking for changes since the last execution.
19+
- After some preliminary checks, the script will execute `snapraid diff` to figure out if parity info is out of date, which means checking for changes since the last execution. During this step, the script will ensure drives are fine by reading parity and content files.
2020
- One of the following will happen:
2121
- If parity info is out of sync **and** the number of deleted or changed files exceeds the threshold you have configured, it **stops**. You may want to take a look at the output log.
22-
- If parity info is out of sync **and** the number of deleted or changed files exceed the threshold, you can still **force a sync** after a number of warnings. It's useful If you often get a false alarm but you're confident enough.
23-
- If parity info is out of sync **but** the number of deleted or changed files did not exceed the treshold, it **executes a sync** to update the parity info.
24-
- When the parity info is in sync, either because nothing has changed or after a successfully sync, it runs the `snapraid scrub` command to validate the integrity of the data, both the files and the parity info. _Note that each run of the scrub command will validate only a configurable portion of parity info to avoid having a long running job and affecting the performance of the server._
22+
- If parity info is out of sync **and** the number of deleted or changed files exceeds the threshold, you can still **force a sync** after a number of warnings. This is useful if you often get false alarms but are confident the changes are expected. This is called "Sync with threshold warnings".
23+
- If parity info is out of sync **but** the number of deleted or changed files did not exceed the threshold, it **executes a sync** to update the parity info.
24+
- When the parity info is in sync, either because nothing has changed or after a successful sync, it runs the `snapraid scrub` command to validate the integrity of the data, both the files and the parity info. If sync was cancelled or other issues were found, scrub will not be run. _Note that each run of the scrub command will validate only a configurable portion of parity info to avoid having a long-running job and affecting the performance of the server._
25+
- Extra information is added, such as SnapRAID's disk health report.
2526
- When the script is done, it sends an email with the results, whether the run succeeded or failed.
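
The rule "if sync was cancelled or other issues were found, scrub will not be run" can be sketched as a small predicate. This is only an illustration of the checks visible in the script's `main()`, not the script's own code: `should_scrub` is a hypothetical helper, and the three flags are passed as plain arguments instead of being read from globals.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (return 0) only when it is safe to scrub.
# Args: chk_fail (1 = a delete/update threshold was breached)
#       do_sync  (1 = a sync job was run)
#       sync_ok  (1 = the sync output contained the success marker)
should_scrub() {
  local chk_fail=$1 do_sync=$2 sync_ok=$3
  # Threshold breached and no sync was forced: parity is stale, skip scrub.
  if [ "$chk_fail" -eq 1 ] && [ "$do_sync" -eq 0 ]; then
    return 1
  fi
  # A sync ran but its success marker is missing: skip scrub to be safe.
  if [ "$do_sync" -eq 1 ] && [ "$sync_ok" -ne 1 ]; then
    return 1
  fi
  return 0
}
```

For example, `should_scrub 0 1 1` succeeds (a normal sync that completed), while `should_scrub 1 0 0` fails (threshold breached, sync not forced).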
2627

27-
Pre-hashing is enabled by default to avoid silent read errors. It mitigates the lack of ECC memory.
28+
## Customization
29+
Many options can be changed to your taste, their behavior is documented in the script config file.
30+
If you don't know what to do, I recommend using the default values and see how it performs.
31+
32+
### Customizable features
33+
- Sync options
34+
- Sync always (forced sync)
35+
- Sync after a number of breached threshold warnings
36+
- Sync only if thresholds warnings are not breached (enabled by default)
37+
- Thresholds for deleted and updated files
38+
- Scrub options
39+
- Enable or disable scrub
40+
- Data to be scrubbed - by default 5% older than 10 days
41+
- Pre-hashing - enabled by default to avoid silent read errors. It mitigates the lack of ECC memory.
42+
- SMART Log - enabled by default, a SnapRAID report for disks health status
43+
- Verbosity - disabled by default, which leaves the TOUCH and DIFF output out of the email to keep it readable
44+
- Spindown - spins down drives after the script has run, disabled because it is currently not working
45+
- Snapraid Status - show the status of the array, disabled because the report output is not rendered correctly
46+
47+
48+
You can also change more advanced options such as mail binary (by default uses `mailx`), SnapRAID binary location, log file location.
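
As a rough illustration, a few of these options might look like this in the config file. The variable names are the ones that appear in the script or in this README; the values are illustrative assumptions and not necessarily the shipped defaults. The meaning of `SYNC_WARN_THRESHOLD` is inferred from the checks in `chk_sync_warn`, and the 0/1 convention for `VERBOSITY` is assumed.

```shell
# Illustrative config fragment; values are assumptions, not the shipped defaults.
DEL_THRESHOLD=2          # warn once this many files have been deleted
UP_THRESHOLD=2           # warn once this many files have been updated
SYNC_WARN_THRESHOLD=-1   # -1: never force a sync, 0: always sync, N: sync after N warnings
PREHASH=1                # 1 enables pre-hashing before the sync
SNAP_STATUS=0            # 1 would append the output of `snapraid status`
VERBOSITY=0              # assumed 0/1 switch for the full TOUCH and DIFF output
```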
2849

2950
## A nice email report
3051
This report produces emails that don't contain a list of changed files to improve clarity.
3152

32-
You can re-enable full output in the email by switching the option `VERBOSITY` but the full report will always be available in `/tmp/snapRAID.out` and will be replaced after each run or deleted when the system is shut down if kept there.
33-
34-
SMART drive report from SnapRAID is also included by default.
53+
You can re-enable full output in the email by switching the option `VERBOSITY`. The full report is always available in `/tmp/snapRAID.out`, but it is replaced after each run and deleted when the system is shut down. You can change the location of the file, if needed.
3554

3655
Here's a sneak peek of the email report.
3756

@@ -66,8 +85,9 @@ DIFF finished [Sat Jan 9 02:07:46 CET 2021]
6685

6786
**SUMMARY of changes - Added [2] - Deleted [0] - Moved [0] - Copied [0] - Updated [0]**
6887

69-
There are deleted files. The number of deleted files, (0), is below the threshold of (2). SYNC Authorized.
70-
There are updated files. The number of updated files, (0), is below the threshold of (2). SYNC Authorized.
88+
There are no deleted files, that's fine.
89+
There are no updated files, that's fine.
90+
SYNC is authorized.
7191

7292
### SnapRAID SYNC [Sat Jan 9 02:07:46 CET 2021]
7393

@@ -157,16 +177,8 @@ All jobs ended. [Sat Jan 9 02:07:49 CET 2021]
157177
Email address is set. Sending email report to example@example.com [Sat Jan 9 02:07:49 CET 2021]
158178
```
159179

160-
## Customization
161-
Many options can be changed to your taste, their behaviour is documented in the script config file.
162-
163-
If you don't know what to do, I recommend using the default values and see how it performs.
164-
165-
You can also change more advanced options such as mail binary (by default uses `mailx`), SnapRAID binary location, log file location.
166-
167-
168180
# Requirements
169-
- Markdown to have nice emails
181+
- Markdown to have nice emails - will be installed if not found
170182
- ~~Hd-idle to spin down disks - [Link TBD] - currently not required since spin down does not work properly.~~
171183

172184
# Installation
@@ -179,6 +191,10 @@ If you want to use this script on OMV, don't worry about the section _Diff Scrip
179191
5. Tweak the config file if needed
180192
6. Schedule the script execution time
181193

194+
It is tested on OMV5, but will work on other distros. In that case, you may have to change the mail binary or the SnapRAID location.
195+
196+
If you want to use this script on OMV, don't worry about the section _Diff Script Settings_ in the main page of the SnapRAID plugin, since it only applies to the built-in plugin script. Also, don't forget to remove the built-in script from the schedule.
197+
182198
# Known Issues
183199
- Hard disk spin down does not work: they are immediately woken up. The script probably does not handle this correctly while running.
184200
- The report is not perfect. We can't solve this because SnapRAID does not natively support Markdown.

snapraid-aio-script.sh

Lines changed: 52 additions & 43 deletions
@@ -3,12 +3,12 @@
33
#
44
# Project page: https://github.com/auanasgheps/snapraid-aio-script
55
#
6-
SNAPSCRIPTVERSION="2.7"
76
########################################################################
87

98
######################
10-
# USER VARIABLES #
9+
# CONFIG VARIABLES #
1110
######################
11+
SNAPSCRIPTVERSION="2.8"
1212

1313
# find the current path
1414
CURRENT_DIR="$(dirname "${0}")"
@@ -129,6 +129,7 @@ function main(){
129129

130130
# Now run sync if conditions are met
131131
if [ $DO_SYNC -eq 1 ]; then
132+
echo "SYNC is authorized. [`date`]"
132133
echo "###SnapRAID SYNC [`date`]"
133134
mklog "INFO: SnapRAID SYNC Job started"
134135
if [ $PREHASH -eq 1 ]; then
@@ -143,7 +144,7 @@ function main(){
143144
mklog "INFO: SnapRAID SYNC Job finished"
144145
JOBS_DONE="$JOBS_DONE + SYNC"
145146
# insert SYNC marker to 'Everything OK' or 'Nothing to do' string to differentiate it from SCRUB job later
146-
sed_me "s/^Everything OK/SYNC_JOB--Everything OK/g;s/^Nothing to do/SYNC_JOB--Nothing to do/g" "$TMP_OUTPUT"
147+
sed_me "s/^Everything OK/**SYNC JOB - Everything OK**/g;s/^Nothing to do/**SYNC JOB - Nothing to do**/g" "$TMP_OUTPUT"
147148
# Remove any warning flags if set previously. This is done in this step to take care of scenarios when user
148149
# has manually synced or restored deleted files and we will have missed it in the checks above.
149150
if [ -e $SYNC_WARN_FILE ]; then
@@ -157,12 +158,13 @@ function main(){
157158
# YES, first let's check if delete threshold has been breached and we have not forced a sync.
158159
if [ $CHK_FAIL -eq 1 -a $DO_SYNC -eq 0 ]; then
159160
# YES, parity is out of sync so let's not run scrub job
161+
echo
160162
echo "Scrub job is cancelled as parity info is out of sync (deleted or changed files threshold has been breached). [`date`]"
161163
mklog "INFO: Scrub job is cancelled as parity info is out of sync (deleted or changed files threshold has been breached)."
162164
else
163165
# NO, delete threshold has not been breached OR we forced a sync, but we have one last test -
164-
# let's make sure if sync ran, it completed successfully (by checking for our marker text "SYNC_JOB--" in the output).
165-
if [ $DO_SYNC -eq 1 -a -z "$(grep -w "SYNC_JOB-" $TMP_OUTPUT)" ]; then
166+
# let's make sure if sync ran, it completed successfully (by checking for our marker text "SYNC JOB -" in the output).
167+
if [ $DO_SYNC -eq 1 -a -z "$(grep -w "SYNC JOB -" $TMP_OUTPUT)" ]; then
166168
# Sync ran but did not complete successfully so lets not run scrub to be safe
167169
echo "**WARNING** - check output of SYNC job. Could not detect marker. Not proceeding with SCRUB job. [`date`]"
168170
mklog "WARN: Check output of SYNC job. Could not detect marker. Not proceeding with SCRUB job."
@@ -179,7 +181,7 @@ function main(){
179181
echo
180182
JOBS_DONE="$JOBS_DONE + SCRUB"
181183
# insert SCRUB marker to 'Everything OK' or 'Nothing to do' string to differentiate it from SYNC job above
182-
sed_me "s/^Everything OK/SCRUB_JOB--Everything OK/g;s/^Nothing to do/SCRUB_JOB--Nothing to do/g" "$TMP_OUTPUT"
184+
sed_me "s/^Everything OK/**SCRUB JOB - Everything OK**/g;s/^Nothing to do/**SCRUB JOB - Nothing to do**/g" "$TMP_OUTPUT"
183185
fi
184186
fi
185187
else
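
The marker mechanism in these hunks (tag SnapRAID's success lines, then look for the tag) can be tried in isolation. The sketch below is an assumption-laden stand-in: `tag_job_output` and `job_succeeded` are hypothetical helpers, and plain `sed` with a temporary file replaces the script's own `sed_me` function, whose body is not shown in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prefix SnapRAID's success lines with a job marker.
# Args: job name (e.g. SYNC or SCRUB), path to the output file
tag_job_output() {
  local job=$1 file=$2 tmp
  tmp=$(mktemp) || return 1
  # Only lines that still START with the plain text are tagged, so lines
  # already tagged by an earlier job are left untouched.
  sed "s/^Everything OK/**${job} JOB - Everything OK**/;s/^Nothing to do/**${job} JOB - Nothing to do**/" \
    "$file" > "$tmp" && mv "$tmp" "$file"
}

# Hypothetical helper: succeed if the marker for the given job is present.
job_succeeded() {
  grep -qw -- "$1 JOB -" "$2"
}
```

Because the SYNC tag is written before the scrub output is appended, a later SCRUB pass only tags the new lines, which is what lets the two jobs be told apart in the same file.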
@@ -203,6 +205,7 @@ function main(){
203205
# Show SnapRAID Status information if enabled
204206
if [ $SNAP_STATUS -eq 1 ]; then
205207
echo
208+
echo "###SnapRAID Status"
206209
$SNAPRAID_BIN status
207210
close_output_and_wait
208211
output_to_file_screen
@@ -231,7 +234,7 @@ function main(){
231234
# do
232235
# if [[ `smartctl -a /dev/$DRIVE | grep 'Rotation Rate' | grep rpm` ]]; then
233236
# echo "spinning down /dev/$DRIVE"
234-
# hd-idle -t $DRIVE
237+
# hd-idle -t /dev/$DRIVE
235238
# fi
236239
# done
237240
# fi
@@ -260,8 +263,6 @@ function main(){
260263
fi
261264
fi
262265

263-
#clean_desc
264-
265266
exit 0;
266267
}
267268

@@ -322,20 +323,29 @@ function sed_me(){
322323

323324
function chk_del(){
324325
if [ $DEL_COUNT -lt $DEL_THRESHOLD ]; then
325-
# NO, delete threshold not reached, lets run the sync job
326-
echo "There are deleted files. The number of deleted files, ($DEL_COUNT), is below the threshold of ($DEL_THRESHOLD). SYNC Authorized."
326+
if [ $DEL_COUNT -eq 0 ]; then
327+
echo "There are no deleted files, that's fine."
328+
DO_SYNC=1
329+
else
330+
echo "There are deleted files. The number of deleted files ($DEL_COUNT) is below the threshold of ($DEL_THRESHOLD)."
327331
DO_SYNC=1
332+
fi
328333
else
329334
echo "**WARNING** Deleted files ($DEL_COUNT) reached/exceeded threshold ($DEL_THRESHOLD)."
330335
mklog "WARN: Deleted files ($DEL_COUNT) reached/exceeded threshold ($DEL_THRESHOLD)."
331336
CHK_FAIL=1
332337
fi
333-
}
338+
}
334339

335340
function chk_updated(){
336341
if [ $UPDATE_COUNT -lt $UP_THRESHOLD ]; then
337-
echo "There are updated files. The number of updated files, ($UPDATE_COUNT), is below the threshold of ($UP_THRESHOLD). SYNC Authorized."
342+
if [ $UPDATE_COUNT -eq 0 ]; then
343+
echo "There are no updated files, that's fine."
344+
DO_SYNC=1
345+
else
346+
echo "There are updated files. The number of updated files ($UPDATE_COUNT) is below the threshold of ($UP_THRESHOLD)."
338347
DO_SYNC=1
348+
fi
339349
else
340350
echo "**WARNING** Updated files ($UPDATE_COUNT) reached/exceeded threshold ($UP_THRESHOLD)."
341351
mklog "WARN: Updated files ($UPDATE_COUNT) reached/exceeded threshold ($UP_THRESHOLD)."
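
Since `chk_del` and `chk_updated` now share the same shape, the pattern can be sketched as one generic function. This is not part of the commit: `chk_count` is a hypothetical name, the warning text is simplified, and logging via `mklog` is left out to keep the example self-contained.

```shell
#!/usr/bin/env bash
DO_SYNC=0
CHK_FAIL=0

# Hypothetical generic form of chk_del/chk_updated.
# Args: label ("deleted" or "updated"), count, threshold
chk_count() {
  local label=$1 count=$2 threshold=$3
  if [ "$count" -lt "$threshold" ]; then
    if [ "$count" -eq 0 ]; then
      echo "There are no $label files, that's fine."
    else
      echo "There are $label files. The number of $label files ($count) is below the threshold of ($threshold)."
    fi
    DO_SYNC=1
  else
    echo "**WARNING** Number of $label files ($count) reached/exceeded threshold ($threshold)."
    CHK_FAIL=1
  fi
}
```

Calling `chk_count deleted 0 2` prints the "no deleted files" message and sets `DO_SYNC=1`; calling `chk_count updated 3 2` prints the warning and sets `CHK_FAIL=1`.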
@@ -345,29 +355,45 @@ function chk_updated(){
345355

346356
function chk_sync_warn(){
347357
if [ $SYNC_WARN_THRESHOLD -gt -1 ]; then
348-
echo "Forced sync is enabled. [`date`]"
358+
if [ $SYNC_WARN_THRESHOLD -eq 0 ]; then
359+
echo "Forced sync is enabled."
349360
mklog "INFO: Forced sync is enabled."
350-
361+
else
362+
echo "Sync after threshold warning(s) is enabled."
363+
mklog "INFO: Sync after threshold warning(s) is enabled."
364+
fi
351365
SYNC_WARN_COUNT=$(sed 'q;/^[0-9][0-9]*$/!d' $SYNC_WARN_FILE 2>/dev/null)
352366
SYNC_WARN_COUNT=${SYNC_WARN_COUNT:-0} #value is zero if file does not exist or does not contain what we are expecting
353-
354-
if [ $SYNC_WARN_COUNT -ge $SYNC_WARN_THRESHOLD ]; then
355-
# YES, lets force a sync job. Do not need to remove warning marker here as it is automatically removed when the sync job is run by this script
356-
echo "Number of threshold warning(s) ($SYNC_WARN_COUNT) has reached/exceeded threshold ($SYNC_WARN_THRESHOLD). Forcing a SYNC job to run. [`date`]"
367+
if [ $SYNC_WARN_COUNT -ge $SYNC_WARN_THRESHOLD ]; then
368+
# force a sync
369+
# if the warn count is zero it means the sync was already forced, do not output a dumb message and continue with the sync job.
370+
if [ $SYNC_WARN_COUNT -eq 0 ]; then
371+
echo
372+
DO_SYNC=1
373+
else
374+
# if there is at least one warn count, output a message and force a sync job. Do not need to remove warning marker here as it is automatically removed when the sync job is run by this script
375+
echo "Number of threshold warning(s) ($SYNC_WARN_COUNT) has reached/exceeded threshold ($SYNC_WARN_THRESHOLD). Forcing a SYNC job to run."
357376
mklog "INFO: Number of threshold warning(s) ($SYNC_WARN_COUNT) has reached/exceeded threshold ($SYNC_WARN_THRESHOLD). Forcing a SYNC job to run."
358377
DO_SYNC=1
378+
fi
359379
else
360380
# NO, so let's increment the warning count and skip the sync job
361381
((SYNC_WARN_COUNT += 1))
362382
echo $SYNC_WARN_COUNT > $SYNC_WARN_FILE
363-
echo "$((SYNC_WARN_THRESHOLD - SYNC_WARN_COUNT)) threshold warning(s) until the next forced sync. NOT proceeding with SYNC job. [`date`]"
364-
mklog "INFO: $((SYNC_WARN_THRESHOLD - SYNC_WARN_COUNT)) threshold warning(s) until the next forced sync. NOT proceeding with SYNC job."
365-
DO_SYNC=0
383+
if [ $SYNC_WARN_COUNT -eq $SYNC_WARN_THRESHOLD ]; then
384+
echo "This is the **last** warning left. **NOT** proceeding with SYNC job. [`date`]"
385+
mklog "INFO: This is the **last** warning left. **NOT** proceeding with SYNC job."
386+
DO_SYNC=0
387+
else
388+
echo "$((SYNC_WARN_THRESHOLD - SYNC_WARN_COUNT)) threshold warning(s) until the next forced sync. **NOT** proceeding with SYNC job. [`date`]"
389+
mklog "INFO: $((SYNC_WARN_THRESHOLD - SYNC_WARN_COUNT)) threshold warning(s) until the next forced sync. **NOT** proceeding with SYNC job."
390+
DO_SYNC=0
366391
fi
392+
fi
367393
else
368394
# NO, so let's skip SYNC
369-
echo "Forced sync is not enabled. Check $TMP_OUTPUT for details. NOT proceeding with SYNC job. [`date`]"
370-
mklog "INFO: Forced sync is not enabled. Check $TMP_OUTPUT for details. NOT proceeding with SYNC job."
395+
echo "Forced sync is not enabled. Check $TMP_OUTPUT for details. **NOT** proceeding with SYNC job. [`date`]"
396+
mklog "INFO: Forced sync is not enabled. Check $TMP_OUTPUT for details. **NOT** proceeding with SYNC job."
371397
DO_SYNC=0
372398
fi
373399
}
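
The warning counter in `chk_sync_warn` behaves like a small state machine kept in a file. The sketch below mirrors that behaviour under stated assumptions: `next_sync_decision` is a hypothetical helper, the counter is read with `read` and validated with `case` rather than with the script's `sed` expression, and the counter file is removed right here on a forced sync, whereas the script removes it later during the sync job.

```shell
#!/usr/bin/env bash
# Hypothetical helper. Prints "sync" or "skip".
# Args: threshold (-1 = forced sync disabled, 0 = always sync,
#                  N > 0 = sync once N warnings have accumulated)
#       path to the warning counter file
next_sync_decision() {
  local threshold=$1 file=$2 count=0
  if [ "$threshold" -le -1 ]; then
    echo "skip"
    return 0
  fi
  # Read the first line of the counter file, if it exists.
  if [ -r "$file" ]; then
    IFS= read -r count < "$file" || true
  fi
  # Fall back to zero if the value is empty or not a plain number.
  case $count in
    ''|*[!0-9]*) count=0 ;;
  esac
  if [ "$count" -ge "$threshold" ]; then
    rm -f "$file"      # simplification: the script clears this during the sync
    echo "sync"
  else
    echo $((count + 1)) > "$file"
    echo "skip"
  fi
}
```

With a threshold of 2, the first two runs print `skip` (the counter file goes to 1, then 2) and the third run prints `sync`.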
@@ -389,23 +415,6 @@ function chk_zero(){
389415
fi
390416
}
391417

392-
function service_array_setup() {
393-
if [ -z "$SERVICES" ]; then
394-
echo "Please configure services"
395-
else
396-
echo "Setting up service array"
397-
read -a service_array <<<$SERVICES
398-
fi
399-
}
400-
401-
function clean_desc(){
402-
# Cleanup file descriptors
403-
exec >&{out} 2>&{err}
404-
405-
# If interactive shell restore output
406-
[[ $- == *i* ]] && exec &>/dev/tty
407-
}
408-
409418
function prepare_mail() {
410419
if [ $CHK_FAIL -eq 1 ]; then
411420
if [ $DEL_COUNT -ge $DEL_THRESHOLD -a $DO_SYNC -eq 0 ]; then
@@ -432,10 +441,10 @@ function prepare_mail() {
432441
MSG="Sync forced with multiple violations - Deleted files ($DEL_COUNT) / ($DEL_THRESHOLD) and changed files ($UPDATE_COUNT) / ($UP_THRESHOLD)"
433442
fi
434443
SUBJECT="[WARNING] $MSG $EMAIL_SUBJECT_PREFIX"
435-
elif [ -z "${JOBS_DONE##*"SYNC"*}" -a -z "$(grep -w "SYNC_JOB-" $TMP_OUTPUT)" ]; then
444+
elif [ -z "${JOBS_DONE##*"SYNC"*}" -a -z "$(grep -w "SYNC JOB -" $TMP_OUTPUT)" ]; then
436445
# Sync ran but did not complete successfully so lets warn the user
437446
SUBJECT="[WARNING] SYNC job ran but did not complete successfully $EMAIL_SUBJECT_PREFIX"
438-
elif [ -z "${JOBS_DONE##*"SCRUB"*}" -a -z "$(grep -w "SCRUB_JOB-" $TMP_OUTPUT)" ]; then
447+
elif [ -z "${JOBS_DONE##*"SCRUB"*}" -a -z "$(grep -w "SCRUB JOB -" $TMP_OUTPUT)" ]; then
439448
# Scrub ran but did not complete successfully so lets warn the user
440449
SUBJECT="[WARNING] SCRUB job ran but did not complete successfully $EMAIL_SUBJECT_PREFIX"
441450
else
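
The `-z "${JOBS_DONE##*"SYNC"*}"` test used in `prepare_mail` is a substring check built on parameter expansion. A minimal standalone version, with `contains` as a hypothetical helper name, could look like this:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when string $1 contains substring $2.
contains() {
  # Stripping the longest prefix that matches *needle* leaves an empty
  # string only when the needle occurs somewhere in the haystack.
  # The extra -n guard matters: an empty haystack would otherwise also
  # expand to an empty string and be reported as a match.
  [ -n "$1" ] && [ -z "${1##*"$2"*}" ]
}
```

For instance, `contains "DIFF + SYNC" "SYNC"` succeeds and `contains "DIFF" "SYNC"` fails. The guard for an empty string is an addition in this sketch; the script's own test has no such guard.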
