Classification of corporate backup systems. Data backup programs. As usual, we now come to complex systems.

Backup Software.

Purchasing suitable equipment is a necessary but not sufficient condition for building a backup infrastructure. The other important part of the problem is selecting the specialized software that will serve as the logical foundation for protecting data from destruction.

If you need to back up a single user's files, standard utilities such as ntbackup on Windows or tar on UNIX systems are usually sufficient. They let you set the backup method and determine whether files have changed (which is required when performing selective backups), but their use across an entire enterprise is hardly appropriate.
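For illustration, here is a minimal sketch of such a single-user backup using only the Python standard library (the tarfile module); the paths are hypothetical, and the same result could be achieved by calling tar directly.

```python
import tarfile
import time
from pathlib import Path

def backup_home(source="/home/alice", dest_dir="/backup"):
    """Create a compressed full copy of one user's files (hypothetical paths)."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(dest_dir) / f"alice-full-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=Path(source).name)
    return archive

if __name__ == "__main__":
    print("Wrote", backup_home())
```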

Small companies can often do without special software at all. Tools with the minimum required backup functionality are supplied with the OS (this is true of both MS Windows and UNIX), and a cut-down version of Legato NetWorker, for example, is supplied with the Oracle DBMS.

Medium and large companies need a well-organized backup infrastructure with a high degree of integration and automation, so they have to purchase specialized software with a client-server architecture.

With corporate information systems the situation becomes significantly more complicated. They include a large number of different computers that use special technologies: file servers, database servers, and so on. Backing up the information on them requires special technological solutions. In addition, for corporate information systems it is important not only to preserve user information, but also to restore the functionality of computers and servers as quickly as possible after any failure, including hardware failures. This avoids long downtime for employees and the associated losses for the company.

Obviously, the successful operation of the entire backup complex requires the coordinated work of both software and hardware. For enterprise-scale backup systems, therefore, standard backup tools are not used. There are several important requirements that data backup and recovery software for large enterprises must satisfy:
- Building the system on the client-server principle. Since any modern information system is based on a network, the backup system must also be network-based. Such a system should provide: backup management across the entire network from dedicated computers; remote backup of data held on servers and workstations; and centralized use of backup devices. Applied to backup, client-server terminology means the following: the component of the backup system that controls all processes and devices is called the server, and the component responsible for saving or restoring specific data is called the client. An enterprise-scale backup product must coordinate all elements of the computer network - workstations, servers, and backup devices - so as to place the least load on devices and communication channels. To achieve this, the software package is organized as follows: a system server, a management console (generally not installed on the server), and backup agents (client programs installed on workstations). In addition, such a product must be able to work with clients running different operating systems. Finally, such programs must provide access to user and database files even if those files are open and in use by the system.
- Multiplatform support. The modern information network is heterogeneous, so the backup system must function fully in such a network: its server part must work in various operating environments and support clients on a variety of hardware and software platforms. At a minimum, clients must be available for different operating systems.
- Automation of routine operations. The backup process inevitably involves many cycles of different operations, so the backup system must perform cyclic work automatically and minimize the number of manual operations. In particular, it must support scheduled backups, media rotation, and scheduled maintenance of backup devices. For example, copying can be done every day at a certain time. Another example of a cycle is overwriting information on backup media: if the daily backup is to be kept for a week, then after this period the corresponding media can be used again. This process of sequentially replacing backup media is called rotation (a minimal sketch of such a rotation scheme follows this list). Cyclic work also includes preventive maintenance of backup devices, for example cleaning the tape drive mechanism with a special cassette after a certain period of operation. Automation is one of the key factors in reducing the cost of maintaining a backup system.
- Support for various backup modes. Suppose that a certain set of files, such as those in a single directory, needs to be backed up every day. As a rule, during the working day changes are made only to individual files, and daily copying of information that has not changed since the previous backup is unnecessary. The system must therefore provide various backup modes, i.e., support the ability to save only the information that has changed since the previous copy was created.
- Easy installation, support for a wide range of drives, and quick recovery of network servers after a disaster. A network server may fail for various reasons, for example due to a hard drive crash or software errors that destroy system information. Restoring it requires reinstalling the OS, configuring devices, installing applications, and restoring the file system and user accounts. All of these operations are very labor-intensive, and errors may occur at any stage. Thus, to restore a server, a backup copy of all information stored on it, including system data, is needed so that it can be brought back to working condition as quickly as possible.
- Availability of modules for the major DBMSs (MS SQL, Oracle, DB2) and business-critical applications (MS Exchange, SAP R/3, etc.), and online data backup. An information system often includes various client-server applications that must function around the clock: e-mail systems, collaboration systems (for example, Lotus Notes), and SQL servers. The databases of such systems cannot be backed up with conventional tools because they are open all the time. They therefore often have their own built-in backup tools, but their use, as a rule, does not fit into the overall technology adopted by the organization. Accordingly, the backup system must be able to save client-server application databases online.
- Possibility of both central and local administration, and developed monitoring and management tools. To manage backup processes and monitor their status, the backup system must have graphical monitoring and control tools, a wide range of event notification mechanisms, and functions for generating and distributing reports.
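As an illustration of the rotation cycle mentioned above, here is a minimal sketch in Python, assuming a retention period of one week and a pool of hypothetical tape labels; real products drive the same logic from their scheduling engine.

```python
from datetime import date, timedelta

RETENTION_DAYS = 7                                              # daily copies are kept for a week
MEDIA_POOL = [f"TAPE-{i:02d}" for i in range(RETENTION_DAYS)]   # hypothetical cartridge labels

def medium_for(day: date) -> str:
    """Simple rotation: after RETENTION_DAYS the same medium is overwritten again."""
    return MEDIA_POOL[day.toordinal() % RETENTION_DAYS]

if __name__ == "__main__":
    start = date.today()
    for offset in range(10):                                    # show which tape each day reuses
        d = start + timedelta(days=offset)
        print(d, "->", medium_for(d))
```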
Enterprise backup software that meets the requirements above necessarily offers more than an SMB (Small/Medium Business) solution, but it also entails significantly higher acquisition and training costs. For this reason, when choosing a product, you should weigh its advanced and additional functions and technologies carefully. For existing small-scale solutions that can no longer be scaled up to meet new requirements, all leading vendors offer upgrade paths to enterprise-class products. Disk-based backup is considered a particularly important feature for large enterprises, since it significantly improves backup performance and provides additional data protection capabilities.

Popular solutions for the corporate sector include HP Data Protector, BakBone NetVault, BrightStor ARCserve Backup (Computer Associates), Legato NetWorker, Veritas NetBackup, and several others. Many of these products are deservedly popular in Russia. All of them are designed to work in heterogeneous environments with different operating systems and large volumes of data, and they meet high requirements for performance, stability, and availability; support for storage area networks is therefore a mandatory component of these products. Through multiplexing, enterprise backup solutions provide high performance, support multiple libraries and drives, and can be tailored to specific needs using agents for databases and operating systems.

A related class of software is the set of additional features that either come with the storage system or are available from third-party vendors. These typically include volume snapshots (snapshot), full working copies of a volume (snapclone), scheduled data replication (replication), and volume-level data mirroring to remote storage (synchronous/asynchronous mirroring).

Manufacturers of data storage systems (DSS) and storage software offer several approaches to providing this functionality. It can be implemented in the controller microcode (Hitachi), as an additional server module, or appliance (EMC, HP, IBM), or at the FC switch level (Cisco, Troika).

The A-brand storage manufacturers listed above zealously make sure that this functionality works only between "their own" systems, i.e. members of the same model family. The solutions available from Cisco and Troika, by contrast, make virtualization transparent for any storage and are universal. It should be noted, however, that both approaches are far from cheap to implement and are not affordable for every organization.

It is also worth dwelling on how to choose software for archiving. As with backup software, the choice of archiving software is determined by the individual needs and requirements of the business. Selection and implementation are carried out with regard to the business processes affected and the relevant legal requirements. An important point is the correct approach to the archived data sets, since the application or the type of information being archived often determines the required software. The following selection criteria are generally considered the most important:
- taking into account legal aspects and legislative requirements;
- a full-fledged search system for the information array;
- ability to work with the required application;
- performance during archiving, searching and evaluation;
- support for necessary devices;
- integration into a complete storage solution.

Since most archiving software is application-specific, some companies offer specialized solutions for the classic e-mail and ERP systems. Major manufacturers of systems for SAP include Open Text (SAP Document Access and SAP Archiving), IBM (DB2 CommonStore for SAP), EMC (Archive Services for SAP), Technoserv AS (Technoserv Content Server), and several others with their products for content and document management and archiving. Integrated solutions that support archiving and information lifecycle management for structured and unstructured data from various applications will become the most rational option in the future, since they can reduce administration costs. The HP Reference Information Storage System (RISS) already supports Microsoft Exchange and Outlook, Lotus Domino, and documents in the file formats of MS Office applications, Adobe PDF, HTML, and others.

The future evolution of backup and archiving software is driven by the trend toward device virtualization, which will provide flexible sharing of resources, broader and more complete application support, and the development of high-performance search capabilities. In addition, a number of developments aim to improve compatibility between backup and archiving software, for example through shared media management. In the long term, the boundaries will become even more blurred - perhaps the two storage disciplines will cease to exist as separate fields.

ALEXEY BEREZHNOY, System Administrator. Main areas of activity: virtualization and heterogeneous networks. Another hobby besides writing articles is popularizing free software

Backup
Theory and practice. Summary

To organize a backup system most effectively, you need to build a real strategy for saving and restoring information

Backup (from the English word "backup") is an important process in the life of any IT infrastructure. It is a parachute for rescue in the event of an unforeseen disaster. At the same time, backup is used to create a kind of historical archive of a company's business activities over a certain period of its life. Working without a backup is like living under the open sky - the weather can turn bad at any moment, and there is nowhere to hide. But how do you organize it correctly so as not to lose important data and not spend fantastic amounts of money on it?

Typically, articles on the topic of organizing backups focus mainly on technical solutions, and only occasionally attention is paid to the theory and methodology of organizing data storage.

This article takes the opposite approach: the focus is on general concepts, and technical means are touched upon only as examples. This allows us to abstract from hardware and software and answer two main questions: "Why are we doing this?" and "Can we do it faster, cheaper, and more reliably?"

Goals and objectives of backup

In the process of organizing a backup, two main tasks are set: restoring the infrastructure in the event of failures (Disaster Recovery) and maintaining a data archive in order to subsequently provide access to information for past periods.

A classic example of a backup copy for Disaster Recovery is an image of the server system partition created by Acronis True Image.

An example of an archive would be the monthly export of 1C databases, written to tape cartridges and then stored in a specially designated place.

There are several factors that differentiate a backup for quick recovery from the archive:

  • Data retention period. For archival copies it is quite long; in some cases it is governed not only by business requirements but also by law. For disaster recovery copies it is relatively short. Usually one or two (with higher reliability requirements) backup copies for Disaster Recovery are kept, with a maximum interval of a day or two between them, after which they are overwritten with fresh ones. In particularly critical cases, the disaster recovery copy may be updated more frequently, for example once every few hours.
  • Speed of access to the data. The speed of access to a long-term archive is not critical in most cases. The need to "pull up data for a period" usually arises when reconciling documents, returning to a previous version, and so on - that is, not in emergency mode. Disaster recovery is another matter: the necessary data and the services' operability must be restored as soon as possible, so the speed of access to the backup copy is an extremely important indicator.
  • Composition of the copied information. An archival copy typically contains only user and business data for a specified period. A copy intended for disaster recovery contains, in addition to this data, either system images or copies of the operating system and application software settings, as well as other information needed for recovery.

Sometimes these tasks can be combined: for example, a year's worth of monthly full snapshots of a file server, plus the changes made during each week. True Image is a suitable tool for creating such a backup.

The most important thing is to understand clearly why the backup is being made. Let me give an example: a critical SQL server failed because of a disk array failure. The correct hardware was in stock, so the only task was to restore the software and data. The company's management asks an understandable question - "When will it be up?" - and is unpleasantly surprised to learn that recovery will take four whole hours. The reason is that throughout the server's entire service life only the databases were regularly backed up, with no thought given to the need to restore the server itself with all its settings, including the DBMS software. Simply put, our heroes saved only the databases and forgot about the system.

Another example. Throughout his entire time on the job, a young specialist used the ntbackup program to maintain a single copy of a file server running Windows Server 2003, including the data and System State, in a shared folder on another computer. Because of limited disk space, this copy was constantly overwritten. After some time he was asked to restore a previous version of a multi-page report that had been damaged when it was saved. With no archive history and with Shadow Copy turned off, he was naturally unable to fulfill the request.

On a note

Shadow Copy, literally "shadow copy". Ensures that instant copies of the file system are created in such a way that further changes to the original have no effect on them. This function makes it possible to create multiple hidden copies of a file over a period of time, as well as on-the-fly backups of files opened for writing. The Volume Shadow Copy Service is responsible for the operation of Shadow Copy.

System State, literally "state of the system". A System State copy backs up the critical components of the Windows operating system, which makes it possible to restore a previously installed system after it has been destroyed. When System State is copied, the registry, boot files, and other files important to the system are saved, including those needed to recover Active Directory, the Certificate Services database, the COM+ Class Registration database, and the SYSVOL directories. In UNIX operating systems, an indirect analogue of copying System State is saving the contents of the /etc and /usr/local/etc directories and other files needed to restore the system state.

What follows from this is that both types of backup are needed: for disaster recovery and for archival storage. In each case, you must determine the list of resources to be copied, when the jobs run, and where, how, and for how long the backup copies will be stored.

With small amounts of data and a not very complex IT infrastructure, you can try to combine both tasks in one, for example by making a daily full copy of all disk partitions and databases. But it is still better to distinguish the two goals and select the right means for each of them. Accordingly, a different tool is used for each task, although there are also universal solutions, such as the Acronis True Image package or the ntbackup program.

It is clear that when defining the goals and objectives of backup, as well as solutions for implementation, it is necessary to proceed from business requirements.

When implementing a disaster recovery task, you can use different strategies.

In some cases it is necessary to restore the system directly to bare metal. This can be done, for example, using Acronis True Image together with the Universal Restore module. In this case, the server can be returned to service in a very short time. For example, it is quite possible to recover a 20 GB operating system partition from a backup in eight minutes (provided the backup copy is accessible over a 1 Gb/s network).
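A back-of-the-envelope check of that figure, under assumed values for protocol and restore overhead (the efficiency factor below is a guess, not a measurement):

```python
# Rough sanity check of the eight-minute estimate above (assumed figures).
image_gb = 20                 # size of the OS partition image
link_mbps = 1000              # nominal network speed, 1 Gb/s
efficiency = 0.35             # assumed effective share after protocol, disk and restore overhead

effective_mb_per_s = link_mbps / 8 * efficiency      # about 44 MB/s
seconds = image_gb * 1024 / effective_mb_per_s
print(f"Estimated restore time: {seconds / 60:.1f} minutes")   # roughly 8 minutes
```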

In other cases it is more expedient simply to "return" the settings to a freshly installed system, for example by copying configuration files from the /etc directory and others on UNIX-like systems (in Windows this roughly corresponds to copying and restoring System State). Of course, with this approach the server will not be put back into operation until the operating system has been installed and the necessary settings restored, which takes much longer. But in any case, the decision on what kind of Disaster Recovery is needed stems from business needs and resource constraints.

The fundamental difference between backup and hardware redundancy

This is another interesting question worth raising. Hardware redundancy means introducing some redundancy into the equipment in order to maintain functionality in the event of a sudden failure of one of the components. An excellent example is a RAID array (Redundant Array of Independent Disks). If one disk fails, you can avoid losing information and safely replace the disk, since the data is preserved by the specific organization of the disk array itself (read more about RAID in the literature).

I have heard the phrase: "We have very reliable equipment, we have RAID arrays everywhere, so we don't need backups." Yes, of course, a RAID array will protect data from destruction if one hard drive fails. But it will not save you from data corruption by a computer virus or from inept user actions. Nor will RAID save you if the file system collapses as a result of an unauthorized reboot.

By the way

The difference between backup and redundancy should be kept in mind when drawing up a data copying plan, whether it concerns an organization or home computers.

Ask yourself why you are making copies. If we are talking about backup, the goal is to preserve data in the event of an accidental (or deliberate) destructive action. Hardware redundancy makes it possible to preserve data, including backup copies, in the event of equipment failure.

There are now many inexpensive devices on the market that provide reliable backup using RAID arrays or cloud technologies (e.g. Amazon S3). It is recommended to use both types of backup simultaneously.

Andrey Vasiliev, CEO of QNAP Russia

Let me give one more example. Events sometimes develop according to the following scenario: a disk fails, and data is reconstructed through the redundancy mechanism, in particular using the saved checksums. Performance drops significantly, the server freezes, and control is almost lost. The system administrator, seeing no other way out, reboots the server with a cold restart (in other words, presses RESET). As a result of such a hard reset on a live system, file system errors appear. The best that can be hoped for in this case is that the disk check utility will run for a long time to restore the integrity of the file system. In the worst case, you have to say goodbye to the file system and puzzle over where, how, and how quickly the data and the server's operability can be restored.

You will not be able to avoid backups even with a cluster architecture. A failover cluster, in essence, maintains the functionality of the services entrusted to it if one of the servers fails. In the event of the problems mentioned above, such as a virus attack or data corruption due to the notorious "human factor," no cluster will save you.

The only thing that can serve as a partial substitute for backup in a Disaster Recovery role is a mirrored standby server with constant data replication from the main server to the standby (following the Primary → Standby principle). In this case, if the main server fails, its tasks are taken over by the standby, and you do not even have to transfer data. But such a system is quite expensive and labor-intensive to set up, and let's not forget the need for constant replication.

It becomes clear that such a solution is cost-effective only for critical services with high fault-tolerance requirements and a minimal recovery time. As a rule, such schemes are used in very large organizations with high commodity and cash turnover. And this scheme is only a partial replacement for backup because, in any case, if the data is damaged by a computer virus, inept user actions, or a misbehaving application, the data and software on both servers may be affected.

And, of course, no redundancy scheme will solve the problem of maintaining a data archive for a given period.

The concept of “backup window”

Performing a backup places a heavy load on the server being backed up, especially on the disk subsystem and network connections. In some cases, when the copying process has a fairly high priority, this can make certain services unavailable. In addition, copying data while it is being changed presents considerable difficulties. Of course, there are technical means of maintaining data integrity in this case, but if possible it is better to avoid such on-the-fly copying.

The solution to the problems described above suggests itself: postpone the start of the copy process to a period of low activity, when the mutual influence of the backup and other running systems is minimal. This time period is called the "backup window". For example, for an organization operating on an 8x5 schedule (five eight-hour working days a week), such a "window" is usually the weekends and night hours.

For systems operating 24x7 (around the clock, all week), the period of minimum activity, when the load on the servers is lowest, is used as the backup window.

Types of backup

To avoid unnecessary material costs when organizing backups, and also, if possible, not to go beyond the backup window, several backup technologies have been developed, which are used depending on the specific situation.

Full backup

It is the main and fundamental method of creating backup copies, in which the selected data array is copied entirely. This is the most complete and reliable type of backup, although it is the most expensive. If it is necessary to save several copies of data, the total stored volume will increase in proportion to their number. To prevent such waste, compression algorithms are used, as well as a combination of this method with other types of backup: incremental or differential. And, of course, a full backup is indispensable when you need to prepare a backup copy for quickly restoring the system from scratch.

Incremental copy

Unlike a full backup, in this case not all the data (files, sectors, etc.) is copied, but only the data that has changed since the last copy. Various methods can be used to determine what to copy; for example, systems running the Windows family of operating systems use a file attribute (the archive bit) that is set when a file is modified and cleared by the backup program. Other systems may use the file's modification date. Clearly, a scheme using this type of backup is incomplete unless a full backup is also carried out from time to time. When performing a full system restore, you need to restore from the last full backup and then "roll up" the data from the incremental copies, one by one, in the order in which they were created.
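A minimal sketch of the modification-date variant, assuming hypothetical paths; the timestamp of the previous run determines which files are picked up:

```python
import shutil
import time
from pathlib import Path

def incremental_copy(source, dest, last_backup_ts):
    """Copy only files modified since the previous backup run (mtime-based variant)."""
    started = time.time()
    for path in Path(source).rglob("*"):
        if path.is_file() and path.stat().st_mtime > last_backup_ts:
            target = Path(dest) / path.relative_to(source)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
    return started          # store this timestamp and pass it to the next incremental run

# Usage (hypothetical paths): ts = incremental_copy("/data", "/backup/inc-01", previous_ts)
```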

What is this type of copying used for? When creating archival copies, it reduces the volume consumed on storage devices (for example, the number of tape media used). It also minimizes the time it takes to complete backup jobs, which can be extremely important when working to a busy 24x7 schedule or moving large volumes of information.

Incremental copying has one caveat you need to know about: step-by-step recovery also brings back files that were deleted during the period being restored. Let me give an example. Suppose a full backup is performed at the weekend and an incremental one on weekdays. A user created a file on Monday, changed it on Tuesday, renamed it on Wednesday, and deleted it on Thursday. With a sequential, step-by-step recovery of the data for the week, we will end up with two files: one with the old name as of Tuesday, before the renaming, and one with the new name created on Wednesday. This happens because different incremental copies stored different versions of the same file, and in the end every variant is restored. Therefore, when sequentially restoring data from an "as is" archive, it makes sense to reserve extra disk space so that the deleted files also fit.

Differential Backup

It differs from incremental backup in that the data copied is everything that has changed since the last full backup. The data accumulates in the archive "cumulatively". On Windows family systems this effect is achieved because the archive bit is not reset during differential copying, so changed data keeps ending up in the differential copy until a full backup resets the archive bits.

Because each new copy created in this way contains all the data of the previous one, it is more convenient for completely restoring data at the moment of a disaster: only two copies are needed, the full one and the last of the differentials, so data can be brought back to life much faster than by rolling out all the increments step by step. In addition, this type of copying is free from the incremental copying quirk described above, where, on a full recovery, old files rise from the ashes like a phoenix. There is less confusion.
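The difference in restore order can be shown with a small sketch; apply_copy is a hypothetical stand-in for unpacking one archive onto the target volume:

```python
def apply_copy(copy_name):
    # Stand-in for unpacking one archive onto the target volume.
    print("restoring", copy_name)

def restore_incremental(full, increments):
    """Incremental scheme: the full copy, then every increment in creation order."""
    for copy in [full] + list(increments):
        apply_copy(copy)

def restore_differential(full, differentials):
    """Differential scheme: only the full copy and the most recent differential are needed."""
    for copy in (full, differentials[-1]):
        apply_copy(copy)

restore_incremental("full-sun", ["inc-mon", "inc-tue", "inc-wed"])
restore_differential("full-sun", ["diff-mon", "diff-tue", "diff-wed"])
```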

But a differential copy is significantly inferior to an incremental one in terms of the space required. Since each new copy stores the data of the previous ones, the total volume of reserved data can be comparable to a full copy. And, of course, when planning the schedule (and calculating whether the backup process will fit into the time window), you need to take into account the time required to create the last, largest differential copy.

Backup topology

Let's look at what backup schemes there are.

Decentralized scheme

The core of this scheme is a shared network resource (see Fig. 1), for example a shared folder or an FTP server. A set of backup programs is also required that, from time to time, upload information from servers and workstations, as well as from other network objects (for example, configuration files from routers), to this resource. These programs are installed on each server and work independently of one another. An undoubted advantage is the ease of implementing this scheme and its low cost. Standard tools built into the operating system, or software such as a DBMS, are suitable as the copying programs: for example, ntbackup for the Windows family, tar for UNIX-like operating systems, or a set of scripts containing built-in SQL server commands for dumping databases into backup files. Another advantage is the ability to use various programs and systems, as long as they can all reach the target resource where the backup copies are stored.
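A minimal sketch of one such independent job - dump a database and push the file to the shared FTP resource. The host name, account, and dump command below are hypothetical; in practice it could be mysqldump, a vendor utility, or a SQL script:

```python
import subprocess
from datetime import date
from ftplib import FTP

DUMP_FILE = f"/tmp/db-{date.today()}.sql.gz"

# Hypothetical dump step: any utility that writes a backup file will do.
subprocess.run(f"pg_dump mydb | gzip > {DUMP_FILE}", shell=True, check=True)

# Upload the dump to the shared backup resource (hypothetical FTP server and credentials).
with FTP("backup.example.local") as ftp:
    ftp.login("backup", "secret")
    with open(DUMP_FILE, "rb") as f:
        ftp.storbinary("STOR " + DUMP_FILE.rsplit("/", 1)[-1], f)
```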

The downside is the clumsiness of this scheme. Since the programs are installed independently of each other, each one has to be configured separately. It is quite difficult to take into account the peculiarities of the schedule and distribute time intervals in order to avoid competition for the target resource. Monitoring is also difficult; the copying process from each server has to be monitored separately from others, which in turn can lead to high labor costs.

Therefore, this scheme is used in small networks, as well as in situations where a centralized backup scheme cannot be organized with the available means. A more detailed description of this scheme and its practical organization can be found in the references.

Centralized backup

Unlike the previous scheme, this one uses a clear hierarchical model working on the client-server principle. In the classic variant, agent programs are installed on each computer, and the server module of the software package is installed on the central server. These systems also have a dedicated management console. The control scheme is as follows: from the console we create jobs for copying, restoring, collecting system information, diagnostics, and so on, and the server gives the agents the instructions needed to perform these operations.

Most popular backup systems work on this principle, including Symantec Backup Exec, CA BrightStor ARCserve Backup, Bacula, and others (see Fig. 2).

In addition to various agents for most operating systems, there are developments for backing up popular databases and corporate systems, for example, for MS SQL Server, MS Exchange, Oracle Database and so on.

For very small companies, you can in some cases try a simplified version of the centralized backup scheme, without agent programs (see Fig. 3). This scheme can also be used if no agent is available for the backup software in use. Instead, the server module uses existing services - for example, it "scrapes" data from hidden shared folders on Windows servers or copies files over SSH from servers running UNIX systems. This scheme has very significant limitations related to the problem of saving files that are open for writing: such files will either be missed and left out of the backup copy, or copied with errors. There are various workarounds, such as re-running the job to copy only the files that were previously open, but none of them is reliable. This scheme is therefore suitable only in certain situations, for example in small organizations working 5x8 with disciplined employees who save their changes and close files before going home. For such a truncated centralized scheme operating exclusively in a Windows environment, ntbackup works well. If you need a similar scheme in a heterogeneous environment or exclusively among UNIX computers, I recommend looking at BackupPC (see the references).
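A minimal sketch of such an agentless pull for the UNIX side, assuming key-based SSH access and rsync on both ends; host names and paths are hypothetical:

```python
import subprocess

# Servers to pull from and the paths to collect (hypothetical).
JOBS = [
    ("web01.example.local", "/etc/"),
    ("db01.example.local", "/var/backups/dumps/"),
]

for host, remote_path in JOBS:
    target = f"/srv/backup/{host}/"
    # rsync over SSH: only files changed since the previous run are transferred.
    subprocess.run(
        ["rsync", "-az", "--delete", f"{host}:{remote_path}", target],
        check=True,
    )
```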

Figure 4. Mixed backup scheme

What is off-site?

In our turbulent, changing world, events can occur that have unpleasant consequences for the IT infrastructure and the business as a whole: a fire in the building, a burst heating radiator in the server room, or the banal theft of equipment and components. One way to avoid losing information in such situations is to store backup copies at a location away from the main site of the server hardware, while still providing a quick way to reach the data needed for recovery. This method is called off-site storage (in other words, keeping copies outside the premises of the enterprise). Two main ways of organizing this process are used.

Writing data to removable media and physically moving it off-site. In this case you need to think about how to get the media back quickly in the event of a failure - for example, by storing them in a neighboring building. The advantage of this method is that it is easy to organize. The downsides are the difficulty of returning the media, the very need to transport the information for storage, and the risk of damaging the media in transit.

Copying data to another location over a network link, for example through a VPN tunnel over the Internet. The advantage in this case is that there is no need to transport media anywhere; the disadvantage is the need for a sufficiently wide channel (as a rule, this is very expensive) and for protection of the transmitted data (for example, with the same VPN). The difficulties of transferring large volumes of data can be significantly reduced by using compression algorithms or deduplication technology.

Security measures when organizing off-site data storage deserve a separate mention. First of all, the data carriers must be kept in a secure location, and measures must be taken to prevent unauthorized persons from reading the data: use an encryption system, sign non-disclosure agreements, and so on. If removable media are used, the data on them must also be encrypted. The labeling scheme used should not help an attacker analyze the data: media should be marked with a neutral numbering scheme rather than with the names of the files they contain. When transmitting data over a network, it is necessary (as already noted above) to use secure transmission methods, for example a VPN tunnel.
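A minimal sketch of encrypting an archive before it leaves the site, using the third-party cryptography package (Fernet); the key file location and archive path are hypothetical, and the key must of course stay on-site and be backed up separately:

```python
from cryptography.fernet import Fernet   # third-party package: cryptography

def encrypt_archive(path, key_path="backup.key"):
    """Encrypt a backup file before off-site transfer; the key never leaves the site."""
    try:
        key = open(key_path, "rb").read()
    except FileNotFoundError:
        key = Fernet.generate_key()
        open(key_path, "wb").write(key)
    token = Fernet(key).encrypt(open(path, "rb").read())
    out = path + ".enc"
    open(out, "wb").write(token)
    return out

# Usage: ship encrypt_archive("/backup/2024-01-31.tar.gz") off-site instead of the clear archive.
```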

We have covered the main points of organizing a backup. The next part will look at guidelines and give practical examples for building an effective backup system.

  1. Description of backup in Windows, including System State – http://www.datamills.com/Tutorials/systemstate/tutorial.htm
  2. Description of Shadow Copy – http://ru.wikipedia.org/wiki/Shadow_Copy
  3. Acronis official website – http://www.acronis.ru/enterprise/products
  4. Description of ntbackup – http://en.wikipedia.org/wiki/NTBackup
  5. Berezhnoy A. Optimizing the operation of MS SQL Server. // System Administrator, No. 1, 2008 – pp. 14-22
  6. Berezhnoy A. Organizing a backup system for small and medium-sized offices. // System Administrator, No. 6, 2009 – pp. 14-23
  7. Markelov A. Linux guarding Windows. Overview and installation of the BackupPC backup system. // System Administrator, No. 9, 2004 – pp. 2-6
  8. Description of VPN – http://ru.wikipedia.org/wiki/VPN
  9. Data deduplication – http://en.wikipedia.org/wiki/Data_deduplication


Most often, the system's existing resources are not enough to perform backup procedures, and you often have to spend a lot of effort justifying the purchase of additional hardware for backup. Indeed, from the point of view of an ordinary user, such a situation may seem unlikely and insignificant; nevertheless, it cannot be ignored.

Thus, it is necessary to begin by defining the list of the most important tasks the system will have to carry out in order to store data as efficiently as possible. To this end, a number of questions must be answered that will help determine the basic required characteristics.

The first thing to determine is how important the data being stored is. If the data can be restored by re-downloading or re-creating, then backup operations can be performed less frequently. In case the data is very important, a more reliable backup strategy must be adopted.

The next important factor is the frequency of changes to the data. The more often the data changes, the more often the backup operation needs to be performed.

In this case, it is necessary to calculate the required amount of disk space. After all, the volumes are affected by the number of copies that must be simultaneously stored in the system.

In a large, heterogeneous enterprise information infrastructure, it may be necessary to divide information into types with different backup requirements.

There are several types of backup. These are full backup, differential backup, incremental backup.

A full backup is the main and fundamental method of creating backup copies, in which the selected data set is copied in its entirety. It is the most complete and reliable type of backup, although it is also the most expensive. If several copies of the data must be kept, the total stored volume grows in proportion to their number. To avoid consuming excessive resources, compression algorithms are used, as well as combinations of this method with other types of backup: incremental or differential. And, of course, a full backup is indispensable when you need a backup copy for quickly restoring the system from scratch.

This method has both its advantages and disadvantages. The main advantage is the ease of restoration from scratch. Since the array is completely saved, it is also not difficult to restore only part of the necessary data. One of the disadvantages is the redundancy of this method. During operation, many files may remain unchanged, however, they will also be included in the backup copy. Thus, a fairly large volume of media will be required. Not only does a full backup take up unnecessary storage space, it can also be time-consuming, especially if you have network-attached storage.

An incremental backup, unlike a full one, does not copy all the data but only the data that has changed since the last backup. Various methods can be used to determine what to copy; for example, systems running Windows operating systems use a file attribute (the archive bit), which is set when a file is modified and reset by the backup program, while other systems may use the file's modification date. Clearly, a scheme using this type of backup is incomplete unless a full backup is carried out from time to time. For a full system recovery, you restore from the latest full copy and then restore the data from the incremental copies one by one, in the order in which they were created. This type is used to reduce the space consumed on storage devices when creating archival copies. It also minimizes the execution time of backup jobs, which can be extremely important when the platform runs continuously. Incremental copying has one caveat: step-by-step recovery also brings back files that were deleted during the period being restored, so when sequentially restoring data from an archive it makes sense to reserve extra disk space so that the deleted files also fit.

One of the advantages of the method is the efficient use of media: because only files that have changed since the last full or incremental backup are saved, the backups take up less space. Backup times are correspondingly shorter - incremental backups take less time than full and differential ones.

The disadvantage of this method is that the backup data is stored on multiple media. Since backups are located on multiple media, restoring a device after a disaster may take longer. Additionally, to effectively restore the system, the media must be processed in the correct order.

Differential backup differs from incremental backup in that everything changed since the last full backup is copied. The data accumulates in the archive "cumulatively". On Windows family systems this effect is achieved because the archive bit is not reset during differential copying, so changed data keeps ending up in the differential copy until a full backup resets the archive bits. Because each new copy created in this way contains all the data of the previous one, it is more convenient for completely restoring data at the moment of a disaster: only two copies are needed, the full one and the last of the differentials, so lost data can be brought back much faster than by restoring all the increments step by step. In addition, this type of copying is free from the incremental copying quirk mentioned above, where old files are needlessly restored during a full recovery; there is less confusion with this method. But differential copying is significantly inferior to incremental copying in terms of the space required: since each new copy stores the data of the previous ones, the total volume of reserved data can be comparable to a full copy. The disadvantage of this method, as with a full backup, is excess data: every differential copy again contains all the files changed since the last full backup.

In the process of backing up data, the problem of choosing a technology for storing backup copies and data arises. Currently, the most common types of media are: magnetic tape drives; network technologies; disk drives.

The most common type of disk drive is the magnetic hard disk drive.

Hard disk drives are the main devices for operational storage of information. With respect to the server case, a distinction is made between internal and external drives. Internal drives are significantly cheaper, but their maximum number is limited by the number of free bays in the case and by the power and number of suitable connectors on the server's power supply. Hot-swap internal drives are ordinary hard drives mounted in special caddies with connectors; the caddies are usually inserted into special bays at the front of the case, and the design allows drives to be removed and inserted while the server is powered on. External drives have their own enclosures and power supplies, and their maximum number is determined by the capabilities of the interface. External drives can also be serviced while the server is running, although this may require suspending access to some of the server's disks.

For large volumes of stored data, external storage units are used - disk arrays and racks - which are complex devices with their own intelligent controllers that, in addition to normal operating modes, provide diagnostics and self-testing of their drives. More complex and reliable storage devices are RAID arrays (Redundant Array of Inexpensive Disks). To the user, a RAID array looks like a single disk on which data is written (and read) in a distributed, redundant fashion across several physical drives (typically 4-5) according to rules determined by the implementation level (0-10).

The advantages of such drives are fast access to data and the ability to access data in parallel without significant loss of speed. Disadvantages include a fairly high cost, higher power consumption, more expensive expansion of the data storage system, and the inability to ensure high security of copies.

There is also the option of storing backup data on network storage. By and large, the information is kept on the same kind of disk drives, only in a remote storage unit; the only difference is that communication with it goes over network technologies. The main advantages are the ease of adding extra capacity for data storage and the fact that the storage does not have to be located right next to the servers whose data is being copied. In addition, any RAID level can be configured in the network storage itself, providing flexibility in choosing the level of protection for the stored data. In some cases you do not even have to purchase additional equipment, but can place the information on rented capacity, which costs significantly less than buying your own equipment for storing the data. The drawbacks of such data placement are the access speed, which in some cases is low, and the need to reserve a communication channel for reaching the storage.

One of the cheapest methods of storing information (in terms of cost per gigabyte) is the use of tape drives. Robotic tape libraries have come a long way in recent years. Like modular disk arrays, such libraries provide flexible and, most importantly, cost-effective expansion of system capacity as the volume of data that needs to be stored on tape grows, offer high reliability, and have powerful remote control and monitoring tools. Even the largest libraries have a limited number of tape cartridge slots, but if all the cartridges in the library are full, some of the old backups can be sent to off-line storage and new cartridges installed in their place. For disk arrays, this kind of "unlimited" scaling is impossible, if only because hard disks are much more expensive than cartridges and are not intended for long-term storage in a powered-off state. The main disadvantage of such storage systems is the mechanical part used to access the required tape cartridge. The second, quite noticeable but not critical, drawback is the speed of copying to tape, which is significantly lower than in disk arrays. But a way around this was quickly found: libraries are equipped with disk arrays, the necessary data is first written to the disk array and then moved to magnetic tape, without interfering with the operation of the system in any way.

As mentioned above, backup alone is not enough; a whole set of measures is needed to prevent emergencies. All components of the backup infrastructure must be considered during planning, and applications, servers, and trends in primary storage capacity must not be ignored.

Reviewing error logs and backup progress is a necessary daily task. Backup problems tend to happen in an avalanche: a single failure can lead to a whole sequence of seemingly unrelated difficulties. For example, a backup job may hang or fail to start because the required tape drive was not freed by a previous job. Such situations require immediate intervention and adjustment of the backup process, so as not to be left without the necessary copies later.
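A minimal sketch of such a daily check, assuming the jobs leave plain-text logs in a hypothetical directory and mark failures with recognizable keywords:

```python
from pathlib import Path

LOG_DIR = Path("/var/log/backup")             # hypothetical location of last night's job logs
FAIL_MARKERS = ("ERROR", "FAILED", "aborted")

def review_last_night():
    """Return the names of jobs whose logs contain a failure marker."""
    problems = []
    for log in sorted(LOG_DIR.glob("*.log")):
        text = log.read_text(errors="ignore")
        if any(marker in text for marker in FAIL_MARKERS):
            problems.append(log.name)
    return problems

if __name__ == "__main__":
    failed = review_last_night()
    print("Jobs needing attention:", ", ".join(failed) or "none")
```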

All backup applications maintain their own database, or catalog, which is needed for the subsequent recovery of the saved data. Losing the catalog means losing the saved data: although some backup applications have mechanisms for rebuilding it by reading the tapes and their indexes, this can be an overwhelming task. Such a catalog should be treated like any other mission-critical database: it is advisable to keep a mirror copy of it, or at least store it on a RAID system, and to make sure that it is itself backed up on schedule and without errors. Corruption of this database can also cause unwanted data loss.

In real systems, data must also be differentiated. The person responsible for backup must clearly understand how the system works and distinguish between data types. As stated above, some data can be backed up less frequently, other data more often; scheduling and copy frequency is one of the most important tasks. But even taking the diversity of the data into account, the system should be as automated and centralized as possible.

Another important procedure is checking that the created copies can be read. Sometimes backups, for one reason or another, turn out to be unreadable, and it is better to discover this before the moment you actually need the copy for recovery.
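A minimal sketch of such a check for tar archives, combining a checksum comparison with a full read of the archive index (the path and the stored checksum are hypothetical):

```python
import hashlib
import tarfile

def verify_backup(archive_path, expected_sha256=None):
    """Return True if the archive is readable end to end and (optionally) unchanged."""
    sha = hashlib.sha256()
    with open(archive_path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            sha.update(chunk)
    if expected_sha256 and sha.hexdigest() != expected_sha256:
        return False
    try:
        with tarfile.open(archive_path, "r:*") as tar:
            tar.getmembers()        # forces a full read; raises on a damaged archive
    except tarfile.TarError:
        return False
    return True

# Usage (hypothetical path): assert verify_backup("/backup/2024-01-31.tar.gz")
```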

Thus, a clearly developed and well-planned backup strategy helps prevent emergencies and data loss, thereby ensuring the full and uninterrupted functioning of the enterprise's information infrastructure.



The backup subsystem is a very important part of any corporate information system. When properly organized, it can solve two problems at once. First, it reliably protects the whole range of important data from loss. Second, it allows a quick migration from one PC to another when necessary - that is, it actually keeps office employees working without interruption. Only then can we speak of effective corporate backup.

It is clear that not every product is suitable for organizing such an archiving system. There are a large number of backup programs on the Russian market, but almost all of them are aimed mainly at home users and SOHO. The market for corporate systems is noticeably smaller, yet there is a choice there too. So today we will look at several programs for organizing information backup in local networks of different sizes. Note that we will not consider the "basic" capabilities of such systems (scheduled operation, compression and encryption of archives, and so on) - it is assumed from the outset that all professional products have them. We will consider only the "corporate" functions and compare the programs in the review on that basis.

Acronis Backup & Recovery 10 Workstation

Acronis Inc. and its products probably need no introduction: it is one of the leaders in the Russian backup systems market. Its arsenal includes a number of programs aimed at different consumers, including a series of corporate products: several server systems and two programs for workstations (the second of which is Acronis Backup & Recovery 10 Workstation Advanced); both products are published in Russia under the 1C: Distribution brand. The first is more suitable for small offices; the second is the most functional, and it is the one we chose for this review.

Let's look at the "corporate" features of the product, starting with how backup itself is organized. The program can duplicate the archives it creates: information can be copied simultaneously to two storage locations, for example one on the network and one local. This approach effectively makes "backup copies of the backups" themselves, which in turn increases the reliability of the backup system at no extra cost (no additional capacity has to be bought for storing archives), especially since the hard drives of office computers usually have plenty of free space.

The next important point is the deduplication module (purchased separately). Deduplication is one of the most effective ways of reducing the volume of backup copies by eliminating duplicates - identical pieces of information. In some cases it can significantly reduce the NAS capacity required for storing archives, and therefore significantly cut the cost of purchasing and maintaining equipment and paying for electricity. It is worth noting that many backup programs, including the product in question, have other means of reducing the size of archives: data compression, excluding certain file types, incremental copying, automatic deletion of outdated copies, and so on. However, deduplication has one very important feature: whereas all the other methods work with a single source, deduplication can find duplicate data on all workstations and servers on the network. Employees often copy data to their own machines for convenience, so the corporate network may hold dozens of copies of the same data; this is where deduplication can be of significant help.
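The idea can be illustrated with a simplified file-level sketch that groups identical content by hash across data staged from several machines (real products usually deduplicate at block level; the paths are hypothetical):

```python
import hashlib
from pathlib import Path

def dedup_report(roots):
    """Group identical files (by content hash) across data copied from several machines."""
    seen = {}
    for root in roots:
        for path in Path(root).rglob("*"):
            if path.is_file():
                digest = hashlib.sha256(path.read_bytes()).hexdigest()
                seen.setdefault(digest, []).append(path)
    duplicates = {d: paths for d, paths in seen.items() if len(paths) > 1}
    saved = sum(paths[0].stat().st_size * (len(paths) - 1) for paths in duplicates.values())
    return duplicates, saved

# Usage (hypothetical staging dirs): dups, bytes_saved = dedup_report(["/staging/pc1", "/staging/pc2"])
```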

Next, let us consider the functions aimed at keeping employees working without interruption - that is, situations involving software or hardware failures on work computers. The first problem is solved by creating images of the operating system with all the software installed and configured; if necessary, such an image can be deployed over the damaged OS in just a few minutes. But that is not all - similar functionality is found in many "home" products. In Acronis Backup & Recovery 10 Workstation, the functionality for working with images has been significantly expanded to help deal with hardware damage as well. First, it is complemented by a virtualization system: the created image can be converted at any time into a virtual machine format (VMware, Microsoft Hyper-V, Citrix XenServer, and Parallels formats are supported). All that remains is to run this virtual machine on any PC, and the employee can continue working in a familiar environment. Second, Acronis Backup & Recovery 10 Workstation offers a Universal Restore module (purchased separately), which makes it possible to deploy an image on any computer regardless of its hardware. Incidentally, this module can help not only to move the system after a PC failure but also to deploy new workplaces quickly.

The next very important aspect of a corporate backup system is integration into the enterprise's existing information system. Acronis Backup & Recovery 10 Workstation provides several tools for this. Particularly worth noting is the automatic execution of commands before the backup procedure starts and after it finishes: with their help you can, for example, stop some system for the duration of the copy and then start it again. Such a need does not arise often, but it does happen (especially with certain specialized software). The ability to work from the command line with script support can also be useful in some cases.

We must not forget the procedures for deploying and administering the backup system. To increase IT staff efficiency and reduce maintenance costs, Acronis Backup & Recovery 10 Workstation implements several very important functions. First of all, the program can be installed on workstations remotely, in the background (invisibly to users) and without a reboot. This makes it possible to deploy a backup system very quickly in a network of almost any size without stopping the work of the office.

In addition, the product offers centralized management of backups, and it is not limited to a primitive connection to workstations followed by "manual" configuration of the agent: it provides full administration based on policies created for entire groups of workstations. Support for Wake-on-LAN technology, which allows switched-off machines to be turned on before the archiving process starts, deserves a separate mention. But that is not all: Acronis Backup & Recovery 10 Workstation gives IT department employees ample opportunities to manage backups located in various storage locations. Administrators can check archives and merge backups manually, or configure this procedure to run automatically.


Today there are many software products that provide data backup. At the corporate level they include products such as:

Acronis True Image Home.

Paragon Drive Backup Server Edition.

Symantec Backup Exec.

Windows System Recovery.

For network backup:

Paragon Drive Backup Enterprise Server Edition.

Acronis Backup & Recovery.

The further review of backup technologies is based on a description of the practical use of the following three software products:

Paragon Drive Backup Workstation.

Acronis True Image Home.

GFI Backup program overview

General characteristics.

System requirements:

Operating system - Microsoft Windows 7 (x86 or x64), Server 2008 (x86 or x64), Vista (x86 or x64), Server 2003 Standard/Enterprise (x86 or x64), XP (x86 or x64)

Processor - Intel Pentium 4 or similar

Memory - 512 MB

Disk space - 100 MB for installation

Characteristics:

1. Safe and reliable data backup and recovery.

GFI Backup provides centrally managed backup and recovery to protect against information loss, preventing the loss of data such as spreadsheets, projects, and images. The process consists of creating a backup copy of the source data in a selected location.

2. Data synchronization.

File synchronization is the process of maintaining an up-to-date set of files across multiple locations, such as a workstation and a laptop. If a user adds, deletes, or modifies a file in one location, GFI Backup adds, deletes, or modifies the same file in all other locations. Using the GFI Backup Agent, users can create their own synchronization tasks in addition to the centralized backup operations.
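For comparison, here is a minimal sketch of a one-way variant of this idea in Python (GFI Backup itself handles this through its agent; the directories below are hypothetical):

```python
import shutil
from pathlib import Path

def mirror(src, dst):
    """One-way sync: new and changed files in src are copied to dst, deletions are propagated."""
    src, dst = Path(src), Path(dst)
    src_files = {p.relative_to(src) for p in src.rglob("*") if p.is_file()}
    dst_files = {p.relative_to(dst) for p in dst.rglob("*") if p.is_file()}
    for rel in src_files:
        s, d = src / rel, dst / rel
        if rel not in dst_files or s.stat().st_mtime > d.stat().st_mtime:
            d.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(s, d)
    for rel in dst_files - src_files:      # files removed at the source are removed at the mirror
        (dst / rel).unlink()

# Usage (hypothetical paths): mirror("C:/Users/alice/Documents", "//nas/sync/alice/Documents")
```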

3. Backup to any data storage device; backup via FTP.

GFI Backup allows you to back up to internal and external hard drives, drives on the local network, network-attached storage devices, CD/DVD/Blu-ray media, portable devices (USB drives, memory cards, flash memory, floppy disks, etc.), and remote locations via FTP with an automatic resume feature.

6. Using standard Zip archives.

Unlike other backup programs, GFI Backup does not use a proprietary archive format; it uses the standard Zip format. This makes it possible to restore data manually even if GFI Backup is not installed. You can choose to create self-extracting archives, as well as back up without data compression for greater speed and redundancy. When using Zip archives, GFI Backup can split an archive and save it across multiple media.
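As a simple illustration of why a standard format matters, here is a minimal sketch that writes an ordinary Zip archive with Python's standard zipfile module and immediately checks that it is readable; any Zip-capable tool could then restore it (the paths are hypothetical, and this is not GFI Backup's own code):

```python
import zipfile
from pathlib import Path

def zip_backup(source_dir, archive_path, compress=True):
    """Write a plain Zip archive that any standard Zip tool can open and restore."""
    mode = zipfile.ZIP_DEFLATED if compress else zipfile.ZIP_STORED
    with zipfile.ZipFile(archive_path, "w", compression=mode) as zf:
        for path in Path(source_dir).rglob("*"):
            if path.is_file():
                zf.write(path, path.relative_to(source_dir))
    # Quick readability check, analogous to verifying the copy right after creation.
    with zipfile.ZipFile(archive_path) as zf:
        assert zf.testzip() is None

# Usage (hypothetical paths): zip_backup("C:/Users/alice/Documents", "D:/backup/docs.zip")
```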