Class FailoverFileReplicationManager
In failover mode, only one replication directory is maintained and it is updated in-place. No space saving techniques are applied. A data index is never used in failover mode.
In backup mode, multiple directories (one per date) are maintained. Regular files are also hard linked between directories (and optionally with the data index).
Compression may be enabled, which will compare chunks of files by MD5 hash and GZIP compress the upload stream. This saves network bandwidth at the cost of additional CPU cycles. This is generally a good trade-off, but when your network is significantly faster than your processor, it may be better to turn off compression.
The compression also assumes that if the MD5 matches in the same chunk location, then the chunk has not changed. To be absolutely sure there is no hash collision, compression must be disabled.
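The chunk comparison described above can be sketched as follows. This is a minimal illustration, not the class's actual protocol code: the class name ChunkCompareSketch, the method changedChunks, and the in-memory byte arrays are all assumptions made for the example. It shows why a matching MD5 at the same chunk location is treated as "unchanged".

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: names and structure are hypothetical. */
final class ChunkCompareSketch {

  /** MD5 of one chunk of the given array. */
  static byte[] md5(byte[] data, int off, int len) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      md.update(data, off, len);
      return md.digest();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is a required algorithm on every Java platform
      throw new IllegalStateException(e);
    }
  }

  /**
   * Returns the indexes of the chunks of newData that would have to be sent,
   * because their length or MD5 differs from the chunk at the same location
   * in oldData.
   */
  static List<Integer> changedChunks(byte[] oldData, byte[] newData, int chunkSize) {
    List<Integer> changed = new ArrayList<>();
    for (int off = 0, index = 0; off < newData.length; off += chunkSize, index++) {
      int newLen = Math.min(chunkSize, newData.length - off);
      int oldLen = Math.min(chunkSize, Math.max(0, oldData.length - off));
      // The length check short-circuits before hashing a chunk that does not exist
      boolean same = oldLen == newLen
          && Arrays.equals(md5(oldData, off, oldLen), md5(newData, off, newLen));
      if (!same) {
        changed.add(index);
      }
    }
    return changed;
  }
}
```

Only the chunks reported as changed would need to cross the network; a hash collision at the same location would go unnoticed, which is the risk noted above.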
Files are compared only by modified time and file length. If both these are the same, the file is assumed to be the same. There is currently no facility for full checksumming or copying of data.
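As a rough sketch of this comparison rule, assuming both files are reachable as local paths (the class and method names here are hypothetical, not part of this API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative only: names are hypothetical. */
final class QuickCompareSketch {

  /**
   * Returns true when the two files have the same length and the same
   * modified time. The contents are never read, so two different files
   * that agree on both values are assumed to be the same.
   */
  static boolean assumedSame(Path a, Path b) throws IOException {
    return Files.size(a) == Files.size(b)
        && Files.getLastModifiedTime(a).equals(Files.getLastModifiedTime(b));
  }
}
```

The check is cheap because it only reads metadata, which is the trade-off against full checksumming.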
When the data index is enabled, the underlying filesystem must have the capabilities of ext4 or better (support 2^16 sub directories and over DataIndex.FILESYSTEM_MAX_LINK_COUNT links to a file).
To minimize the amount of metadata updates, old backup trees are recycled and used as the starting point for new backups. This dramatically improves the throughput in the normal case where most things do not change.
The data index may be turned on and off for a partition. Newly created backup directory trees will use the format currently set, but will also recognize the existing data in either format.
When data index is off, each file is simply stored, in its entirety, directly in-place. If the file contents (possibly assumed only by length and modified time) and all attributes (ownership, permission, times, ...) match another backup directory, the files will be hard linked together to save space.
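The hard linking of unchanged files between backup directories might look roughly like the sketch below. It is a simplification under stated assumptions: the names are hypothetical, and only length and modified time are compared, whereas the text above also requires ownership, permissions, and other attributes to match.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative only: names are hypothetical and attribute checks are reduced. */
final class HardLinkSketch {

  /**
   * Places the file at target. When the copy in the previous backup directory
   * is assumed to be the same (length and modified time), target becomes a
   * hard link to it and true is returned. Otherwise the file is stored in its
   * entirety and false is returned.
   */
  static boolean linkOrCopy(Path source, Path previous, Path target) throws IOException {
    boolean assumedSame = Files.isRegularFile(previous)
        && Files.size(previous) == Files.size(source)
        && Files.getLastModifiedTime(previous).equals(Files.getLastModifiedTime(source));
    if (assumedSame) {
      // Files.createLink(link, existing): both names now share one inode
      Files.createLink(target, previous);
      return true;
    }
    Files.copy(source, target, StandardCopyOption.COPY_ATTRIBUTES);
    return false;
  }
}
```

A hard link consumes a directory entry but no additional data blocks, which is where the space saving comes from.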
When the data index is enabled, each file is handled one of three ways:
- If the file is empty, it is stored directly in place, not using the data index. Also, empty files are never hard linked because no space is saved by doing so.
- If the filename is less than MAX_NODIR_FILENAME in length:
  - The file is represented by an empty surrogate in its original location
  - A series of hard linked data chunks with original filename as prefix
- If the filename is too long:
  - A directory is put in place of the filename
  - An empty surrogate named "<A<O<SURROGATE>O>A>" is created
  - A series of hard linked data chunks
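The three cases above can be summarized as a small decision function. This is only a sketch: the enum, the method, and the maxNodirFilename parameter are hypothetical, the actual value of MAX_NODIR_FILENAME is not given here, and measuring the filename in characters rather than bytes is an assumption.

```java
/** Illustrative only: names are hypothetical. */
final class DataIndexLayoutSketch {

  enum Layout {
    /** Empty file: stored directly in place, never hard linked. */
    IN_PLACE,
    /** Empty surrogate in the original location plus chunks prefixed with the filename. */
    SURROGATE_WITH_PREFIXED_CHUNKS,
    /** Directory in place of the filename, holding the surrogate and the chunks. */
    DIRECTORY_WITH_SURROGATE
  }

  /**
   * Chooses the on-disk layout for one file when the data index is enabled.
   * maxNodirFilename stands in for MAX_NODIR_FILENAME.
   */
  static Layout classify(long fileLength, String filename, int maxNodirFilename) {
    if (fileLength == 0) {
      return Layout.IN_PLACE;
    }
    if (filename.length() < maxNodirFilename) {
      return Layout.SURROGATE_WITH_PREFIXED_CHUNKS;
    }
    return Layout.DIRECTORY_WITH_SURROGATE;
  }
}
```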
A surrogate file contains all the ownership and mode information and, in the future, will represent the hard link relationships in the original source tree.
During an expansion process, the surrogate might not be empty as data is put back in place. The restore processes resume where they left off, even when interrupted.
Data indexes are verified once per day, and a quick verification is performed on start-up.
depending on the length of the filename.
16 TiB = 2 ^ (10 + 10 + 10 + 10 + 4) = 2 ^ 44
Each chunk is up to 1 MiB: 2 ^ 20
Maximum number of chunks per file: 2 ^ (44 - 20) = 2 ^ 24
TODO: filename<A<O<S>O>A>...
TODO: Can't have any regular filename from client with <A<O<CHUNK>O>A> pattern.
TODO: Can't have any regular file exactly named "<A<O<SURROGATE>O>A>"
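The chunk arithmetic above can be checked with a few lines of code. The class and constant names are hypothetical; only the 16 TiB and 1 MiB figures come from the text.

```java
/** Illustrative only: names are hypothetical. */
final class ChunkMathSketch {

  /** 16 TiB = 2^44 bytes. */
  static final long MAX_FILE_SIZE = 1L << 44;

  /** Each chunk is up to 1 MiB = 2^20 bytes. */
  static final long CHUNK_SIZE = 1L << 20;

  /** Maximum number of chunks per file: 2^(44 - 20) = 2^24. */
  static long maxChunks() {
    return MAX_FILE_SIZE / CHUNK_SIZE;
  }

  /** Number of chunks needed for a file of the given length, rounding up. */
  static long chunkCount(long fileLength) {
    return (fileLength + CHUNK_SIZE - 1) / CHUNK_SIZE;
  }
}
```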
TODO: Handle hard links (pertinence space savings), and also meet expectations. Our ParallelPack/ParallelUnpack are a good reference.
TODO: Need to do mysqldump and postgresql dump on preBackup
TODO: Use LVM snapshots within the client layer
TODO: Support chunking from either data set: current file or in linkToRoot, also possibly try to guess the last temp file? This would allow to not have to resend all data when a chunked transfer is interrupted. This would have the cost of additional reads and MD5 CPU, so may not be worth it in the general case; any way to detect when it is worth it, such as a certain number of chunks transferred?
TODO: Support sparse files. In simplest form, use RandomAccessFile to write new files, and detect sequences of zeros, possibly only when 4k aligned, and use seek instead of writing the zeros. Could also build the zero detection into the protocol, which would put more of the work on the client and remove the need for MD5 and compression of the zeros, at least in the case of full 1 MiB chunks of zeros.
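The simplest form of the sparse file idea in this TODO could be sketched as below. Nothing here is implemented by this class: the names and the 4 KiB block size are assumptions, and whether a hole is actually created depends on the underlying filesystem.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Path;

/** Illustrative only: a sketch of the TODO, not existing behavior. */
final class SparseWriteSketch {

  /** Hypothetical block size used for zero detection. */
  static final int BLOCK = 4096;

  static boolean allZero(byte[] buf, int len) {
    for (int i = 0; i < len; i++) {
      if (buf[i] != 0) {
        return false;
      }
    }
    return true;
  }

  /** Copies the stream to target, skipping over runs of zeros instead of writing them. */
  static void copySparse(InputStream in, Path target) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(target.toFile(), "rw")) {
      raf.setLength(0);
      byte[] buf = new byte[BLOCK];
      long pos = 0;
      int len;
      while ((len = in.read(buf)) != -1) {
        if (!allZero(buf, len)) {
          // Seeking past unwritten regions leaves them to read back as zeros
          raf.seek(pos);
          raf.write(buf, 0, len);
        }
        pos += len;
      }
      // Extends the file when it ends in zeros that were never written
      raf.setLength(pos);
    }
  }
}
```

Because InputStream.read may return short reads, the zero detection here is not guaranteed to be 4k aligned; an aligned variant would need to fill each block completely before testing it.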
- Author:
- AO Industries, Inc.
Nested Class Summary
-
Method Summary
Modifier and Type, Method, Description:
- static void checkPath: Checks a path for sanity.
- static void checkSymlinkTarget(String target): Checks a symlink target for sanity.
- static void failoverServer(Socket socket, StreamableInput rawIn, StreamableOutput out, AoservDaemonProtocol.Version protocolVersion, int failoverFileReplicationPkey, String fromServer, boolean useCompression, short retention, String backupPartition, short fromServerYear, short fromServerMonth, short fromServerDay, List<Server.Name> replicatedMysqlServers, List<String> replicatedMysqlMinorVersions, int quotaGid): Receives incoming data for a failover replication.
- static FailoverFileReplicationManager.Activity getActivity(Integer failoverFileReplicationPkey)
- static void start()
-
Method Details
-
checkPath
Checks a path for sanity.
- Must not be empty
- Must start with '/'
- Must not contain null character
- Must not contain empty path element "//"
- Must not end with '/', unless it is the root "/" itself
- Must not contain "/../"
- Must not end with "/.."
- Must not contain "/./"
- Must not end with "/."
- Throws:
IOException
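The rules listed above translate fairly directly into code. The following is a sketch of one possible implementation, not the actual method body: the class name, the String parameter, the exception messages, and the isSane helper are assumptions made for illustration.

```java
import java.io.IOException;

/** Illustrative only: one way the listed rules could be enforced. */
final class PathSanitySketch {

  /** Throws IOException when the path violates any of the listed rules. */
  static void checkPath(String path) throws IOException {
    if (path.isEmpty()) {
      throw new IOException("Path is empty");
    }
    if (path.charAt(0) != '/') {
      throw new IOException("Path does not start with '/'");
    }
    if (path.indexOf('\0') != -1) {
      throw new IOException("Path contains null character");
    }
    if (path.contains("//")) {
      throw new IOException("Path contains empty path element \"//\"");
    }
    // The root "/" itself is the only path allowed to end with '/'
    if (path.length() > 1 && path.endsWith("/")) {
      throw new IOException("Path ends with '/'");
    }
    if (path.contains("/../") || path.endsWith("/..")) {
      throw new IOException("Path contains a \"..\" element");
    }
    if (path.contains("/./") || path.endsWith("/.")) {
      throw new IOException("Path contains a \".\" element");
    }
  }

  /** Hypothetical convenience wrapper that reports the result as a boolean. */
  static boolean isSane(String path) {
    try {
      checkPath(path);
      return true;
    } catch (IOException e) {
      return false;
    }
  }
}
```

Note that a name such as "/a/..b" passes, because the rules reject only whole "." and ".." path elements.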
-
checkSymlinkTarget
Checks a symlink target for sanity.
- Must not be empty
- Must not contain null character
- Throws:
IOException
-
getActivity
public static FailoverFileReplicationManager.Activity getActivity(Integer failoverFileReplicationPkey)
-
failoverServer
public static void failoverServer(Socket socket, StreamableInput rawIn, StreamableOutput out, AoservDaemonProtocol.Version protocolVersion, int failoverFileReplicationPkey, String fromServer, boolean useCompression, short retention, String backupPartition, short fromServerYear, short fromServerMonth, short fromServerDay, List<Server.Name> replicatedMysqlServers, List<String> replicatedMysqlMinorVersions, int quotaGid) throws IOException, SQLException
Receives incoming data for a failover replication. The critical information, such as the directory to store to, has been provided by the master server because we can't trust the sending server.
- Parameters:
backupPartition - the full path to the root of the backup partition, without any hostnames, packages, or names
quotaGid - the quota_gid or -1 for no quotas
- Throws:
IOException
SQLException
-
start
- Throws:
IOException
SQLException
-