+++
date = "2021-08-04T12:10:05+01:00"
description = ""
title = "LTO-4 Backup with Bacula"
categories = [ "Backup", "Bacula", "Tape", "LTO-4" ]
thumbnail = "/images/blog/bacula-lto4-backup/lto4.jpg"
+++

# Intro

With Archlinux32 reaching some terabytes of data to back up, I needed
something "modern", like a tape. Big tapes like LTO-8 are close
to unaffordable, but LTO-4 drives and tapes can be acquired on the
cheap. They are getting thrown out of server rooms at the moment.

An LTO-4 tape can take 800 GB of uncompressed data and the drive can be bought
on eBay for 200 to 300 CHF. Media is affordable at ca. 40 CHF per tape.

# Tape Drive

The first drive I ordered was advertised as working, but it turned out
to be the kind of drive that only produces squealing noises and is
really hungry for tapes (it kills them). Well, my plan was to stay
under 1000 CHF for a backup solution, so I simply ordered a
second one, keeping the first one for spare parts. Both drives are
HP Ultrium 1840s.

The second drive turned out to work just fine, but connecting it brought
some issues of its own. The tape drive comes in a noisy black box, which
I definitely don't want to run 24 hours a day. So I decided to remove
the drive and squeeze it (quite literally) into a machine.

# Tape Media

After ordering quite the wrong tape (LTO-4 WORM, which costs more and
can be written only once, but has "smartness" built in to be tamper-proof,
oh well), I got boxes and boxes of old tapes on Ricardo from somebody
desperately trying to get rid of them. Which is cool with me. The
price per tape dropped to around 20 CHF this way and I have more tapes
than I could ever have wished for.

# Connectivity Issues

I tried several SCSI cards to connect to the drive. The drive uses
the last generation of parallel SCSI, which is quite a nuisance to
find cards for. Either SCSI cards are server-grade (PCI-X) or they
are not fast enough. Some cards (like dedicated backup SCA host
adapters) work fine in some machines, but not in others. The SCSI
cables are prone to transmission errors: a 4 meter long external
SCSI cable (external or internal 68-pin LVD) is just not reliable
at Ultra320 speeds (320 MB/s).

I went with a short shielded internal SCSI cable and put the drive
as close to the SCA host adapter as possible. This gave the best
results.

The result looks like this:

{{< figure src="/images/blog/bacula-lto4-backup/fitting.jpg" alt="LTO-4 drive fitting the machine" >}}

# Manual Backups and Tools

## tar

Most people nowadays don't know anymore what the 't' in 'tar' stands
for - you guessed it: '***t***ape ***ar***chive'. :-)

There are other formats, but usually the "rule-of-least-surprise" applies
here: the simpler the command line parameters used and the more widespread
the format, the more likely somebody else (or even you yourself) is able
to actually read and restore the data.

## mt

The old magnetic tape tool is no longer available as a binary package
on Archlinux, but there is an AUR package 'mt-st-git' providing the
'mt-st' binary.

You need this tool for basic operations on the tape like positioning,
ejecting, setting compression, etc.

## Some use cases

### Rewind and eject

```
mt-st -f /dev/nst0 rewoffl
```

### Erasing tapes

```
mt-st -f /dev/nst0 defcompression 0
mt-st -f /dev/nst0 compression 0
mt-st -f /dev/nst0 rewind
tar -cvf /dev/nst0 /dev/null
mt-st -f /dev/nst0 rewind
```

***Note***

I'm disabling compression on the tapes for several reasons:
* with compression on, I cannot deliver data fast enough to keep the drive streaming, which results in shoe-shining
* the remaining capacity of a tape is much more predictable
* I have enough tapes anyway. :-)

### Append to end of data

```
mt-st -f /dev/nst0 eom
tar zcvf /dev/nst0 *
```

### Status of the drive, current position of the tape

```
mt-st -f /dev/nst0 status
```

## sg_logs

This tool can give you all kinds of internal information like
temperature, I/O errors of the drive, and media information.

The first page of information serves as a sort of index of
what the drive can report:

```
shell> sg_logs /dev/nst0 -p 0

    HP        Ultrium 4-SCSI    B32D
Supported log pages  [0x0]:
    0x00        Supported log pages [sp]
    0x02        Write error [we]
    0x03        Read error [re]
    0x0c        Sequential access device [sad]
    0x0d        Temperature [temp]
    0x11        DT Device status [dtds]
    0x12        Tape alert response [tar]
    0x13        Requested recovery [rr]
    0x18        Protocol specific port [psp]
    0x2e        Tape alert [ta]
    0x30        Tape usage (lto-5, 6) [tu_]
    0x31        Tape capacity (lto-5, 6) [tc_]
    0x32        Data compression (lto-5) [dc_]
    0x33        Write errors (lto-5) [we_]
    0x34        Read forward errors (lto-5) [rfe_]
    0x35        DT Device Error (lto-5, 6) [dtde_]
    0x3e        Device Status (lto-5, 6) [ds_]
```

For instance I can get the temperature of the drive with:

```
shell> sg_logs /dev/nst0 -p 13

    HP        Ultrium 4-SCSI    B32D
Temperature page  [0xd]
  Current temperature = 47 C
  Reference temperature = <not available>

```

This could be built into a Nagios check script that monitors the health
of the drive, but then I would have to manually unmount the tape pool in
bacula-sd before each check.
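A minimal sketch of what such a check could look like, assuming the temperature page output keeps the format shown above. The function name `check_tape_temp` and the thresholds are made up for illustration; the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical Nagios-style check, not part of any package.
# Reads the output of `sg_logs <device> -p 13` on stdin.
# Usage: sg_logs /dev/nst0 -p 13 | check_tape_temp WARN CRIT
check_tape_temp() {
    warn=$1
    crit=$2
    # Extract the number from a line like "  Current temperature = 47 C".
    temp=$(sed -n 's/^ *Current temperature = \([0-9][0-9]*\) C.*$/\1/p' | head -n 1)
    if [ -z "$temp" ]; then
        echo "UNKNOWN - no temperature found in sg_logs output"
        return 3
    fi
    if [ "$temp" -ge "$crit" ]; then
        echo "CRITICAL - drive temperature $temp C"
        return 2
    elif [ "$temp" -ge "$warn" ]; then
        echo "WARNING - drive temperature $temp C"
        return 1
    fi
    echo "OK - drive temperature $temp C"
    return 0
}
```

Keeping the parsing separate from the `sg_logs` call means the function can be tried out on saved output without touching the drive.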

## socat

socat is like netcat and more. It lets you build tunnels between
machines, so that the 'tar' command can pack files on one machine
and send them to another machine, where the tape write command is
attached to a listening socat.

```
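# on the machine with the files to back up
tar cvf - * | socat - TCP4:<server_with_tape>:8080
# on the machine where the tape is (start this side first);
# iflag=fullblock (GNU dd) makes dd collect full 10240-byte blocks from the pipe
socat TCP4-LISTEN:8080 - | dd of=/dev/nst0 bs=10240 iflag=fullblock status=progress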
```

***Note***

When using dd I set the block size manually to 20*512=10240 bytes, which
is the default record size of 'tar' on Linux (blocking factor 20).
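The arithmetic behind that number can be sketched as follows. The helper `check_record_alignment` is a hypothetical name, and the sketch assumes GNU tar, which pads its output to a whole number of records.

```shell
# tar's default blocking factor: 20 blocks of 512 bytes per record.
BLOCKING_FACTOR=20
RECORD_SIZE=$((BLOCKING_FACTOR * 512))   # 10240 bytes, the bs= value passed to dd

# Read a stream on stdin and report whether its length is a whole
# number of records. Returns 0 if it is, non-zero otherwise.
check_record_alignment() {
    bytes=$(wc -c | tr -d ' ')
    records=$((bytes / RECORD_SIZE))
    rest=$((bytes % RECORD_SIZE))
    echo "$bytes bytes = $records records of $RECORD_SIZE bytes, remainder $rest"
    [ "$rest" -eq 0 ]
}
```

Something like `tar cf - somedir | check_record_alignment` should then report a remainder of 0. The blocking factor can also be spelled out on the tar side with `tar -b 20 -cf - *`, so both ends of the pipe visibly agree.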

## mbuffer

Writing directly to the tape has some drawbacks, as the tape drive
is very fast and you cannot deliver data fast enough over a 1 Gbit/s
network. So here 'mbuffer' helps to at least buffer data for some time
and then flush it in one burst to the tape drive. This avoids the
dreadful "shoe-shining" which not only drives you crazy (the sounds of it),
but also reduces the lifetime of the components (or at least of the
mechanics of the tape drive):

```
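# on the machine with the files to back up
tar cvf - * | mbuffer -m 2G -P100% | \
	socat - TCP4:<server_with_tape>:8080
# on the machine where the tape is (start this side first);
# iflag=fullblock (GNU dd) makes dd collect full 10240-byte blocks from the pipe
socat TCP4-LISTEN:8080 - | mbuffer -m 2G -P100% | \
	dd of=/dev/nst0 bs=10240 iflag=fullblock status=progress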
```

Buffering on either side is possible; I'm not sure whether having a buffer
on both sides improves anything.
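To get a feeling for what a 2G buffer buys, here is a rough back-of-the-envelope sketch. All three rates are assumptions and not measurements from this setup: 120 MB/s is the nominal native LTO-4 streaming rate, and 110 MB/s is a guess at the usable payload rate of a 1 Gbit/s link. The function names are made up.

```shell
# Assumed figures, not measured on this setup:
DRIVE_MB_S=120      # nominal LTO-4 native (uncompressed) streaming rate in MB/s
NET_MB_S=110        # rough usable payload rate of a 1 Gbit/s link in MB/s
BUFFER_MB=2048      # mbuffer -m 2G

# Seconds needed to fill the buffer from the network before writing starts (-P100%).
fill_seconds() {
    echo $((BUFFER_MB / NET_MB_S))
}

# Seconds the drive can keep streaming from a full buffer while the network
# keeps refilling it: the buffer drains at (drive rate - network rate).
burst_seconds() {
    deficit=$((DRIVE_MB_S - NET_MB_S))
    if [ "$deficit" -le 0 ]; then
        echo "unlimited"   # the network keeps up, so the buffer never drains
    else
        echo $((BUFFER_MB / deficit))
    fi
}
```

With these assumed numbers the buffer fills in under 20 seconds and then feeds the drive for a few minutes per burst. If tar itself reads slower than the network can carry, `NET_MB_S` has to be lowered accordingly and the bursts get much shorter.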

# Use Cases

## Full Backup

I did a full backup of everything onto 10 tapes with the
'tar/socat/mbuffer/dd' method.

This is data which is quite stable and hardly ever changes, so I'll just keep
it on some tapes with write protection enabled. It doesn't make
much sense to put them into a bacula job, as the retention period is
basically 30 years or so - or till the tape dies.

The index of the tape is a simple text file, noting the kind of data,
the size, the tape number, the file number (offset on tape) and the
date of the backup:

```
doc             946M    1       0       17.4.2021
Attic           13G     1       1       17.4.2021
bilder          19G     1       2       17.4.2021
projects        29G     1       3       18.4.2021
ARCHIVE         16G     1       5       18.4.2021
BACKUPS         122G    1       6       18.4.2021
...
music           154G    4       0       19.4.2021
movies part1    547G    5       0       19.4.2021
movies part2    785G    6       0       20.4.2021
```
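A small sketch of how such an index could be used to find the right tape and file number again. The function name `tape_index_lookup` and the file name `tapes.idx` are hypothetical, and the sketch assumes the column layout shown above (name, size, tape number, file number, date), where only the name may contain spaces.

```shell
# Look up an entry in the plain-text tape index.
# Columns are counted from the right: date, file number, tape number, size;
# whatever remains on the left is the name (e.g. "movies part1").
# Usage: tape_index_lookup INDEX_FILE NAME   -> prints "<tape> <file>"
tape_index_lookup() {
    awk -v want="$2" '
        NF >= 5 {
            name = $1
            for (i = 2; i <= NF - 4; i++) name = name " " $i
            if (name == want) { print $(NF - 2), $(NF - 1); found = 1; exit }
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example use on the machine with the drive (commented out, it needs
# the right tape loaded):
# set -- $(tape_index_lookup tapes.idx "BACKUPS")   # $1 = tape, $2 = file number
# mt-st -f /dev/nst0 asf "$2"    # rewind, then space forward to that file
# tar xvf /dev/nst0              # extract into the current directory
```

The lookup returns a non-zero exit code if the name is not in the index, so a restore script can bail out before touching the tape.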

## Bacula

I now use bacula for the daily incremental and full backups, both to tape
and to offline cloud storage.

bacula-sd just works and integrates with the rest of my backup system
(the master bacula-dir is still living on an old Raspberry Pi). The only
thing I am missing is the ability to write a bacula job to two different
media, one being the remote cloud storage and the other one the tape.
Sort of a bacula 'tee' would be nice to have.

# References

* http://cdrtools.sourceforge.net/private/portability-of-tar-features.html:
  on tar formats and compatibility
* https://copyconstruct.medium.com/socat-29453e9fc8a6: blog about socat
* https://www.commandlinefu.com/commands/view/13582/backup-to-lto-tape-with-progress-checksums-and-buffering
* https://aur.archlinux.org/packages/mt-st-git/