
Securing your online backup archives

May 12th, 2010

One of the concerns of many people who consider performing online backups is the matter of security. You are uploading sensitive stuff to a foreign site. Can anyone from within read this stuff? And what if the site is hacked and white collar thieves living in some foreign country get hold of the data? What would happen?

One solution is to protect each and every document using a password. Many programs have such a capability built in. For many one, two or three person organisations this solution could work; the people would password protect every file using a phrase that is shared amongst colleagues. As the number of employees increases, guaranteeing that everyone is obeying the rules becomes too problematic. Besides, certain file types cannot be password protected.

The script I am sharing addresses this problem. It makes use of the commercial product WinRAR to archive an entire directory (including subdirectories) into a RAR file. The RAR file name is user definable and the file is placed in a folder under C:\RSB. The RAR archive is password protected using a password passed to the script. The script is called rsb.cmd.
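The rsb.cmd script itself is not reproduced in this excerpt, so the following is only a rough sketch of the kind of command line such a script might build. It assumes the command-line rar program that ships with WinRAR is on the PATH, and the function name build_rar_command, the example paths and the choice to write directly into C:\RSB are illustrative assumptions, not the author's script.

```python
from pathlib import PureWindowsPath


def build_rar_command(source_dir, archive_name, password, dest_root=r"C:\RSB"):
    """Return the argument list for a recursive, password-protected RAR archive.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    archive_path = PureWindowsPath(dest_root) / f"{archive_name}.rar"
    return [
        "rar",               # assumes rar.exe is reachable on PATH
        "a",                 # "a" = add files to an archive
        "-r",                # recurse into subdirectories
        f"-hp{password}",    # encrypt both file data and archive headers
        str(archive_path),
        source_dir,
    ]


command = build_rar_command(r"D:\Work", "docs", "s3cret")
print(" ".join(command))
```

A password passed on the command line can be visible to other processes on the same machine, which is worth keeping in mind before using this approach on a shared computer.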


Read more »

Remote online backup providers

May 10th, 2010

Very recently, I read an article by W. Curtis Preston (at SearchDataBackup.com) about remote online backup services. As this blog deals with data backups, I would like to comment on some of his points. W. Curtis Preston is an executive editor at TechTarget and an independent backup expert.

Preston starts his article by pointing out the major setback of remote online backups, namely the time the first full backup takes, and mentions the seeding option. Seeding is when a service provider offers to transfer a complete set of the customer’s data by physical means, such as removable drives. In addition, he mentions the bandwidth limitations associated with large uploads of data. This is true, and I would consider these factors the only limitations of remote online backups, as we stressed in our previous articles. In fact, we both agree that home users and SMBs are the ideal customers of these services.

Read more »


An intro to data deduplication

March 23rd, 2010

Data deduplication is a data backup process that eliminates duplicated data. At first thought, one may think that the word deduplication means the negation of duplication! Although this is what happens in data deduplication, that connotation is not the right one! Actually, the word deduplication means the division of that which is one whole into two or more pieces. In fact, the data deduplication mechanism divides data into blocks or chunks of bits in order to eliminate the redundant pieces within data.
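As a minimal sketch of the block idea, assuming fixed-size blocks and SHA-256 fingerprints (both are illustrative choices, as are the names deduplicate and rebuild), each distinct block is stored once and the data is described by an ordered list of fingerprints:

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4):
    """Split data into fixed-size blocks and keep each distinct block once."""
    store = {}    # fingerprint -> block contents, stored a single time
    recipe = []   # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)
        recipe.append(fingerprint)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the stored blocks."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


original = b"ABCDABCDABCDWXYZ"
store, recipe = deduplicate(original)
print(len(recipe), "blocks referenced,", len(store), "blocks stored")
```

Real products typically use much larger or variable-size blocks; the tiny block size here only keeps the example readable.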

Read more »

Remote backup management consoles

February 15th, 2010


A remote or online backup solution is the way forward for off-site data protection. Due to regulatory compliance, some corporations are holding back from going in this direction; however, much work is being done in this area and soon there will be providers that offer such conformity. On the other hand, the majority of SMBs that have no specific regulatory requirements, and certainly most households, should consider this platform as their main off-site backup solution.

Nevertheless, SMBs and households should not forget to back up their data locally first and then use a remote storage location as a second means of protection - my advice is:

 

Read more »

Total Cost of Ownership of Data Backups

January 19th, 2010

Are online backup and recovery solutions cheaper than their traditional counterparts? Before I deal with this argument I would like to make a few points: online backups provide an offsite disaster recovery solution, you can access your data from anywhere provided that you have an internet connection and, additionally, you will be enjoying the expertise and the scalability of big vendors.

Online backups offer cheaper costs per GB for the same functionality because you only pay for what you use. :) The costs include the storage used, bandwidth consumed and other related services. Data security is based on the latest encryption algorithms and adequate auditing features would place the end-user’s mind at rest!

Read more »

Are online backups for your computer a safe idea?

December 23rd, 2009

Are cars safe? Are computers safe? Is buying over the internet safe? These are some of the questions that people who pose the question above might have asked a hundred, thirty and fifteen years ago respectively. Today, many of us use cars and computers and regularly effect payments over the internet without much thought. When talking about online backups, the simple answer to the question being asked is yes.


Read more »

Go for online remote backups

November 9th, 2009

With today’s inexpensive and large-sized disk drives, disk-to-disk backups have become the default choice for many SMBs! However, off-site backup procedures require the use of tapes and hence the use of related peripherals such as tape drives. Traditional backup methodologies bring along the need for more storage space and a sharp increase in operational costs! Although data archives and off-site backups have their advantages, and are a must when meeting compliance regulations, they must adhere to more rigid safeguards such as encryption mechanisms and safe storage. Adequate drills need to be put in place to test recovery procedures on a regular basis. Traditional methods create data duplication. To eliminate duplication of data, various methods and applications were created, which are known as data de-duplication. Data de-duplication improves data protection, increases the speed of service, and reduces costs.


Why online backups are the way forward – because they provide:

  1. User friendly backup/restore applications
  2. Native off-site and archival services
  3. Reduced secondary storage requirements through data de-duplication concepts
  4. Intelligent data transfer methods through hashing algorithms
  5. Safe – the same data protection means, such as encrypted storage and connections
  6. Scalability for future growth
  7. Cheaper TCO (total cost of ownership)
  8. Introduce pay-per-use concepts
  9. Data retrieval anywhere-anytime concepts

Rsync – A Top-View Introduction

October 17th, 2009

Even with today’s ultra fast fibre optic data lines, the tangible throughput people actually get when transferring files is still functionally limited. This is compounded further by the fact that the majority of users have low upload speeds compared to higher download speeds. For example, a typical home internet connection would promise 4Mbit download while only 256Kbit would be allocated for upload. To make matters worse, today’s files are many times larger than their ancestors. If your hair has greyed (or fallen out) in sufficient quantities, you probably recall a time when a 1.2MB floppy disk would hold all your documents as well as the word processing program itself with space to spare. This thousand word document is over 6 times larger than that floppy disk!
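To put those line speeds into perspective, here is a small back-of-the-envelope calculation. It assumes the quoted figures are per second and decimal (256Kbit = 256,000 bits per second), ignores protocol overhead, and uses a 1 GB backup purely as an illustrative size.

```python
def transfer_hours(size_bytes: int, rate_bits_per_second: int) -> float:
    """Idealised transfer time in hours, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bits_per_second / 3600


ONE_GB = 10**9  # illustrative backup size, decimal gigabyte

upload_hours = transfer_hours(ONE_GB, 256_000)        # 256 Kbit/s upstream
download_hours = transfer_hours(ONE_GB, 4_000_000)    # 4 Mbit/s downstream

print(f"upload:   {upload_hours:.2f} hours")
print(f"download: {download_hours * 60:.1f} minutes")
```

Under these assumptions the upload takes a little under nine hours while the same data downloads in roughly half an hour, which is why reducing the amount of data sent upstream matters so much.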

In this environment, when one discusses online backups, the critical issue is to try to limit as much as possible the amount of data that has to be uploaded from the person’s computer (the source) to that person’s archive vault (the destination).   Many people would consider online backups only if the lengthy wait periods are few and far between.

On a regularly backed up system, the state of the source compared to the destination can be as follows:

  • New file on source that does not exist on the destination – the file must be copied over to the archive vault;
  • File on destination that no longer exists on the source – the file must be removed from the archive vault;
  • File on source and destination are identical – no need to do anything;
  • File on source is different from that on destination.
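The four cases can be sketched as a simple comparison of two catalogues. This is only an illustration: the function name classify and the idea of representing each side as a dictionary that maps a file path to some content fingerprint are assumptions made for the example.

```python
def classify(source: dict, destination: dict) -> dict:
    """Decide what to do with each file path.

    Both arguments map a file path to a content fingerprint
    (for example a checksum of the file).
    """
    actions = {}
    for path in source.keys() | destination.keys():
        if path not in destination:
            actions[path] = "copy"      # new on the source
        elif path not in source:
            actions[path] = "delete"    # only left over in the archive vault
        elif source[path] == destination[path]:
            actions[path] = "skip"      # identical on both sides
        else:
            actions[path] = "update"    # present on both sides but different
    return actions


source = {"a.doc": "111", "b.xls": "222", "c.txt": "333"}
destination = {"b.xls": "222", "c.txt": "999", "old.bak": "444"}
for path, action in sorted(classify(source, destination).items()):
    print(path, "->", action)
```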

On a system that is regularly backed up, one would normally find few files that are new or that have been deleted. The majority of files would either have not changed or would have been altered. As I will demonstrate shortly, the Rsync algorithm for transferring data from source to destination is greatly suited for situations in which a file has been adjusted. There are two ways a system can handle altered files: resending the adjusted file in its entirety or simply transferring the changed pieces. The Rsync algorithm does the latter.

The Rsync algorithm was developed by Andrew Tridgell and Paul Mackerras. Taking a data backup using the Rsync method results in ultra fast and efficient backups. Imagine a database, worksheet or word processing document in which the author only changed one record, cell or paragraph. Take your PIM database; on a daily basis you receive new emails, delete junk and old mail, set up appointments and have the system remove expired ones. Rsync backups will transmit a few megabytes of changes rather than make you wait until all modified files (that probably run into gigabytes) are uploaded.

The Rsync algorithm


The file on the archive server (the destination file that needs to be updated) is split into a number of blocks of equal size (the last block could be an exception). For each block a signature is generated.


The signature consists of a quick-to-compute 32 bit checksum (I’ll explain the rolling capability later on) as well as a 128 bit checksum. In Rsync, the MD5 (Message digest) algorithm is now being utilised.
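A minimal sketch of how such per-block signatures could be produced is shown below. The 32 bit checksum follows the two-part sum described in Tridgell and Mackerras's technical report; the function names and the tiny block size are illustrative assumptions, and this is not the code of the Rsync program itself.

```python
import hashlib


def weak_checksum(block: bytes) -> int:
    """Quick 32 bit checksum built from two 16 bit sums."""
    a = sum(block) & 0xFFFF
    # Each byte is weighted by its distance from the end of the block.
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return (b << 16) | a


def block_signatures(data: bytes, block_size: int):
    """Return (block number, 32 bit checksum, 128 bit MD5 digest) per block."""
    signatures = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        block = data[offset:offset + block_size]  # the last block may be shorter
        signatures.append((index, weak_checksum(block), hashlib.md5(block).digest()))
    return signatures


signatures = block_signatures(b"abcdefghij", block_size=4)
for index, weak, strong in signatures:
    print(index, hex(weak), strong.hex())
```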

The block number as well as the associated signatures for each block are transmitted to the source computer. The source computer takes the 32 bit low computational checksum of each block and generates a simple 16 bit hash value from it, which it uses to look up candidate blocks quickly.
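The lookup structure on the source side might look something like the sketch below. The article does not say how the 16 bit hash is derived, so folding the two halves of the 32 bit checksum together is an assumption made for illustration, as are the names hash16 and build_lookup.

```python
from collections import defaultdict


def hash16(weak: int) -> int:
    """Fold a 32 bit checksum into 16 bits (assumed scheme, for illustration)."""
    return ((weak >> 16) + (weak & 0xFFFF)) & 0xFFFF


def build_lookup(signatures):
    """Group (block number, 32 bit checksum, 128 bit checksum) tuples by hash.

    Several blocks can share a 16 bit hash, so each slot holds a list.
    """
    table = defaultdict(list)
    for index, weak, strong in signatures:
        table[hash16(weak)].append((index, weak, strong))
    return table


# Made-up signature values, only to show blocks sharing a slot.
example_signatures = [
    (0, 0x00010002, b"strong-0"),
    (1, 0x00020001, b"strong-1"),
    (2, 0x00000005, b"strong-2"),
]
lookup = build_lookup(example_signatures)
print({key: [entry[0] for entry in entries] for key, entries in lookup.items()})
```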

Simplistically speaking, the source end will transmit to the destination how to reconstruct the file using either literal data transmitted from the source to the destination or by telling the destination to utilise a block already present at its end.
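On the destination side, rebuilding the file from such a stream could be sketched as follows. The instruction format, a list of ("copy", block number) and ("literal", bytes) pairs, is a hypothetical one chosen for this example and not the actual Rsync wire protocol.

```python
def reconstruct(old_data: bytes, block_size: int, instructions) -> bytes:
    """Rebuild the new file from the old copy plus a list of instructions."""
    output = bytearray()
    for kind, value in instructions:
        if kind == "copy":
            # Reuse a block that already exists at the destination.
            start = value * block_size
            output += old_data[start:start + block_size]
        elif kind == "literal":
            # Data that had to be sent over the line.
            output += value
        else:
            raise ValueError(f"unknown instruction: {kind!r}")
    return bytes(output)


old_file = b"aaaabbbbcccc"
instructions = [("copy", 0), ("literal", b"XY"), ("copy", 2)]
new_file = reconstruct(old_file, 4, instructions)
print(new_file)
```

In this example only the two literal bytes would need to travel from source to destination; the rest is copied from blocks the destination already holds.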


At the source, a block of identical size to those computed at the destination is analysed and the low overhead 32 bit checksum is generated. Its 16 bit hash is compared to those computed from the destination 32 bit checksums. If no match is found, then the code that deals with a nonexistent block kicks in.

If the hashes match, the 128 bit checksum is computed on the source block and this is compared to the 128 bit checksums received from the destination for the blocks that have the same hash value (there may be more than one) as the source block. If no match is found, the code that deals with a nonexistent block kicks in.

If a destination block is matched, the source sends an instruction to the destination to copy the block having a particular index to the new output file. The position at the source is advanced forward the length of the block and the process loops.

The logic behind the nonexistent block is as follows:

  1. It transmits the character at the beginning of the block to the destination to append to the file that is being reconstructed;
  2. Advances the block by one character and repeats the process.
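Putting the matching steps and the nonexistent-block logic together, the source side could be sketched roughly as follows. This is a simplified illustration and not the Rsync source code: the 16 bit folding, the instruction format and all function names are assumptions, and the 32 bit checksum is recomputed at every position for clarity rather than updated incrementally.

```python
import hashlib
from collections import defaultdict


def weak_checksum(block: bytes) -> int:
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return (b << 16) | a


def hash16(weak: int) -> int:
    return ((weak >> 16) + (weak & 0xFFFF)) & 0xFFFF  # assumed folding scheme


def make_delta(new_data: bytes, old_data: bytes, block_size: int):
    """Return ("copy", block number) / ("literal", bytes) instructions."""
    # Signatures the destination would have sent for its old copy.
    table = defaultdict(list)
    for index, offset in enumerate(range(0, len(old_data), block_size)):
        block = old_data[offset:offset + block_size]
        weak = weak_checksum(block)
        table[hash16(weak)].append((index, weak, hashlib.md5(block).digest()))

    instructions, literal, position = [], bytearray(), 0
    while position < len(new_data):
        window = new_data[position:position + block_size]
        weak = weak_checksum(window)
        match = None
        for index, cand_weak, cand_strong in table.get(hash16(weak), ()):
            # The expensive 128 bit check runs only after the cheap ones agree.
            if cand_weak == weak and hashlib.md5(window).digest() == cand_strong:
                match = index
                break
        if match is None:
            literal.append(new_data[position])  # nonexistent block: send one byte
            position += 1                        # and advance by one character
        else:
            if literal:
                instructions.append(("literal", bytes(literal)))
                literal.clear()
            instructions.append(("copy", match))
            position += len(window)              # advance by the block length
    if literal:
        instructions.append(("literal", bytes(literal)))
    return instructions


delta = make_delta(b"aaaaXbbbbcccc", b"aaaabbbbcccc", block_size=4)
print(delta)
```

Here a single byte was inserted into the file, and the delta consists of one literal byte plus three copy instructions, even though every block after the insertion has shifted position.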


Point 2 explains where the rolling 32 bit algorithm comes into play. By having a rolling 32 bit algorithm, the computational overhead necessary to calculate the new checksum is minimised further, since all that is necessary is to subtract the value of the byte at the previous start of block and add the value of the byte at the new (shifted) end of block. The alternative would be to sum up the whole range again, and this consumes more computing cycles.
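The rolling property can be illustrated with the two-part checksum from Tridgell and Mackerras's technical report. The sketch below (function names are illustrative) updates both 16 bit parts in constant time when the window slides by one byte, and checks the result against a full recomputation.

```python
def checksum_parts(block: bytes):
    """Compute both 16 bit parts of the checksum from scratch."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) & 0xFFFF
    return a, b


def roll(a: int, b: int, outgoing: int, incoming: int, block_size: int):
    """Slide the window one byte: drop `outgoing`, take in `incoming`."""
    a = (a - outgoing + incoming) & 0xFFFF
    b = (b - block_size * outgoing + a) & 0xFFFF
    return a, b


data = b"the quick brown fox"
size = 5
a, b = checksum_parts(data[:size])
for start in range(1, len(data) - size + 1):
    a, b = roll(a, b, data[start - 1], data[start + size - 1], size)
    # The rolled value must equal a full recomputation over the new window.
    assert (a, b) == checksum_parts(data[start:start + size])
print("rolling and full recomputation agree")
```

Each slide costs a handful of additions and subtractions regardless of the block size, whereas recomputing from scratch touches every byte in the block.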

The performance increase between a copy-all backup and one using Rsync can be more than 110 times. Data transmitted by Rsync can also be compressed and, when the transfer runs over a secure channel such as SSH, is encrypted in transit. For online backups the Rsync method is the best way to ensure that backups are fast, secure and non-intrusive, and that they actually get done.

In a future article I’ll give some practical examples to explain how Rsync deals with different file situations, namely adding new data at various positions within a file, deleting data from different positions within a file and altering data. For more in depth information on Rsync, you might want to visit the Rsync home page at http://Rsync.samba.org/ as well as spend a couple of evenings reading through Andrew Tridgell’s paper Efficient Algorithms for Sorting and Synchronization.