WHAT THIS PROGRAM DOES: ----------------------- Checksum adds additions data assurance capabilities to filesystems which support extended attributes. Checkit allows you to detect any otherwise undetected data integrity issues or file changes to any file. By storing a checksum as an extended attribute, checkit provides an easy way to detect any silent data corruption, bit rot or otherwise modified error. This was inspired by the checksumming that is performed by filesystems like BTRFS add ZFS. These filesystems ensure data integrity by storing a checksum (CRC) of data and checking read data against the checksum. With mirroring of data, they can silently heal the data should an error be found. This program does not duplicate this ability completely, but offers rudimentary checksum abilities for other filesystems. It simply calculates a checksum and stores the checksum with the file. It can then be later used to verify the data against the checksum. Any data corruption of file changes would result in a failed check. WHY IT WAS CREATED: ------------------- Moving data from disk to disk, or leaving data on the disk, leaves a very small possibility of silent data corruption. While rare, the large amounts of data being handled by drives make silent corruption a real possibility. While BTRFS and ZFS can handle this, other filesystems can't. This program was created to add an ability to detect (but not fix) issues. With the ability to detect, you can easily find out whether a copy or extract operation occured perfectly, or whether there has been bit rot in the file. Backups provide point of reference, but comparison isn't very efficient, as it involves reading two files. Using a CRC, you only need to read the file once, even after a copy operation, to determine whether the file is OK. Also, should the file be different from the backup, which copy is OK, the backup or the original? Checkit will let you know whether it is the backup or original which has changed. There are other ways to do this. You can use a cryptographic hash and store a SHA-1 or MD5 or SHA-256 value in a separate file, or even use GPG and digitally sign the file. The problem is, that the value is stored in a seperate file, and doing directory recursion, or all files of a particular type (i.e., all JPG's isn't as straighforward. With checkit, the CRC is stored as an extended attribute. It remains as part of the file, and can be copied or archived automatically with the file. No need for seperate files to store the hash/checksum. Checkit also has the ability to export this to a hidden file, and import it back into an extended attribute. HOW TO USE: ----------- Checkit calculates and stores the CRC as an extended attribute (user.crc64) or as a hidden file. The file must reside on a filesystem which supports extended attributes (XFS, JFS, EXT2, EXT3, EXT4, BTRFS, ReiserFS among others) to use the extended attribute (recommended). Checkit allows you to determine whether the checksum is 'read only' or 'read write', using the -d and -u options. A 'read only' checksum will not be updated, unless the '-o' overwrite option is used. This way, the checksum is not inadvertantly updated when rerunning checkit over it. A 'read write' checksum can be updated, and can be used in those cases where you want to rerun checkit over the file, and recompute the checksum because of intended changes. For the most part, checkit is intended for files you want to remain static, such as your precious family photos and movies, archives and such.