When a SQL Server instance has a large number of databases (100s or 1000s), it can be very challenging to complete backups in a timely manner. Using common methods such as Ola Hallengren scripts or native maintenance plans often take a long time due to serial processing (1 processor). So the question becomes, how do we perform backups in parallel using multiple processors?
With any backup automation, there’s a few core requirements that we’ll keep in mind while developing this solution:
* Ensure newly created databases are backed up
* Ensure deleted databases are not backed up
* Ensure we don’t backup a database multiple times
I’ve come up with a method to run Ola Hallengren’s famous maintenance scripts in parallel. Using dynamic SQL we can generate a new database list at runtime, and backup just the databases in that list. There’s no need to maintain a list of databases and there’s no chance a database will be missed. We will create multiple SQL Agent jobs that each refer to a separate list of databases. These jobs can then be executed in parallel.
Decide On The Number of Jobs
First step to this process is to decide on the number of parallel jobs you want to create. For this example, I will use 4 parallel jobs. It would be wise to consider how many processors you have on the system and the load incurred by the server during this process.
Dividing Up The Databases
Every database is given a unique integer ID upon creation. This database_id can be found in the sys.databases system table. Using the modulo function, we can divide up the databases based on database_id and the number of jobs we decided upon (in my case, 4). We can view the modulo value via:
SELECT name, database_id, database_id % 4 AS [mod] FROM sys.databases
Unfortunately, Ola Hallengren’s scripts don’t run on database_id, they run on database name. Let’s convert our script to return a list of database names, remove the hard coded number, and make the script easier to update:
DECLARE @NumberOfJobs INT DECLARE @ThisJob INT SET @NumberOfJobs = 4 SET @ThisJob = 0 DECLARE @DATABASES NVARCHAR(MAX) SELECT @DATABASES = COALESCE(@DATABASES + ',', '') + name FROM sys.databases WHERE database_id % @NumberOfJobs = @ThisJob
Notice the new variable @ThisJob. This variable will be different for each SQL Server Agent job you create. Use values 0 through the number of jobs minus one.
Also notice the COALESCE call which includes a comma. This is critical as it will generate a comma separated list of database names which we can then pass into Ola Hallengren using the @DATABASES variable.
Dynamically Call Ola Hallengren
Looking at the default parameters for Ola Hallengren’s DatabaseBackup procedure, we can see that all databases are backed up via the
@Databases = 'USER_DATABASES' parameter value. Let’s convert a DatabaseBackup call to dynamic SQL:
DECLARE @SQL NVARCHAR(MAX) SELECT @SQL = 'EXECUTE [dbo].[DatabaseBackup] @Databases = ''USER_DATABASES'', @Directory = N''C:\SQLBackups'', @BackupType = ''FULL'', @Verify = ''Y'', @CleanupTime = 48, @CheckSum = ''Y'', @LogToTable = ''Y''' --print @SQL EXEC (@SQL)
And now, let’s combine the modulo script and the dynamic SQL script. This will be our final product that we can put directly into a SQL Agent job.
DECLARE @NumberOfJobs INT DECLARE @ThisJob INT SET @NumberOfJobs = 4 SET @ThisJob = 0 DECLARE @DATABASES NVARCHAR(MAX) SELECT @DATABASES = COALESCE(@DATABASES + ',', '') + name FROM sys.databases WHERE database_id % @NumberOfJobs = @ThisJob DECLARE @SQL NVARCHAR(MAX) SELECT @SQL = 'EXECUTE [dbo].[DatabaseBackup] @Databases = ''' + @DATABASES + ''', @Directory = N''C:\SQLBackups'', @BackupType = ''FULL'', @Verify = ''Y'', @CleanupTime = 48, @CheckSum = ''Y'', @LogToTable = ''Y''' --print @SQL EXEC (@SQL)
Pay attention to where the @Databases parameter is set to our modulo generated list of databases. By calling the modulo script everytime, we guarantee that new databases will be backed up and deleted databases will not.
You Must Create Multiple Jobs
To ensure you backup every database on the instance, you must create multiple agent jobs. How many is determined by the value you specified in @NumberOfJobs. If you leave a value out, then all the databases with that database_id modulo result will not be backed up.
In practice, I name my jobs using the default job names with a suffix that corresponds to which modulo result the job uses. I script one of the jobs out, then modify the suffix and the @ThisJob variable. This gives me a tidy set of agent jobs:
There’s definitely room for more automation in this solution. I still need to automate the deployment of all jobs, whichout the need to modify the suffix or @ThisJob. But that’s a simple matter of PowerShell string concatenation. Stay tuned for a future post with that solution.