
Trying Out Jobs in PowerShell

An older app in our workplace stack is a WebForms website project with a directory structure large enough that the site takes a while to compile. I have to run the site a fair amount for a rewrite effort and it changes enough to make the initial build and run painfully slow.

Since I have been working on a build related PowerShell module lately, I thought precompiling the "legacy" site might be a good candidate for a background job. I have not used jobs in PowerShell before and for whatever reason I was having a hard time finding good, complete examples of using them. There were also some things that tripped me up, so here is an example for the future version of me to reference some day.
param (
	$TrunkPath = "D:\Projects\MyApp\trunk"
)

function Start-WebsitePrecompile
{
    $logFile = (join-path $TrunkPath "build\output\logs\BackgroundCompile.log")
    "Build log file is $logFile"
    $msg = @"
Starting background compile of site. Use Get-Job to check progress.
You may go on about your merry way but may want to leave the host open until complete.
"@
    Write-Output $msg
    $job = Start-Job -InputObject $TrunkPath -Name MyAppPageCompile -ScriptBlock {
        # doesn't appear transcription is supported here
        $trunk = $input

        Set-Alias aspnetcompile $env:windir\Microsoft.NET\Framework\v4.0.30319\aspnet_compiler.exe
        # see website solution file for these values
        $virtualPath = "/web"
        $physicalPath = (join-path $trunk "\web")
        $compilePath = (join-path $trunk "\PrecompiledWeb\web")
        aspnetcompile -v $virtualPath -p $physicalPath -f -errorstack $compilePath
    }

    # output details when the job state changes
    Register-ObjectEvent $job -MessageData $logFile -EventName StateChanged `
        -SourceIdentifier Compile.JobStateChanged `
        -Action {
            $logFile = $event.MessageData
            Set-Content -Force -Path $logFile `
                -Value $(Receive-Job -Id $($sender.Id) -Keep | Out-String)
            #$eventSubscriber | Unregister-Event
            Unregister-Event -SourceIdentifier Compile.JobStateChanged
            $eventSubscriber.Action | Remove-Job
            Write-Host "Job # $($sender.Id) ($($sender.Name)) complete. Details at $logFile."
        }
}

Some Notes

  • Everything inside the job's script block will be executed in another PowerShell process; anything from outside the script block that needs to be used inside must be passed into the script block with the InputObject parameter ($input). While this might be obvious it does mean potential refactoring considerations.
  • It didn't appear transcription was supported inside the script block which was disappointing.
  • I half expected that Start-Job would provide a parameter for a block to call when the job was complete. Register-ObjectEvent works but it's a bit verbose and isn't even mentioned in many posts talking about job management.
  • Like the job script block, the event handler action script block cannot refer to anything from the outside other than anything passed into the block with the MessageData parameter and automatic variables such as $event, $eventSubscriber, $sender, $sourceEventArgs, and $sourceArgs.
  • I went through some trial and error getting the output from the job in the completed event. The Set-Content / Receive-Job combination above worked fine but it was not the most obvious syntax to arrive at.
  • There are a couple of ways to unregister events, such as the commented-out $eventSubscriber approach in the code, but I found that when I called the function again I received an error that the event was already subscribed to, so it was clear the unregistration was not working for some reason. The current code is working but similar code did not work previously. I dunno man, gremlins.
  • Event handler cleanup strikes me as a bit odd and this Register-TemporaryEvent script is worth a look.
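To make the first note concrete, here is a minimal sketch of passing caller data into a job (the variable and job names are made up for illustration); -ArgumentList with a param block is an alternative to the -InputObject/$input approach used above:

```powershell
$data = "value from the caller's scope"

# $data is NOT visible inside the job's script block; it must be passed in explicitly
$job = Start-Job -Name DemoJob -ArgumentList $data -ScriptBlock {
    param($passed)
    "Received: $passed"
}
Wait-Job $job | Out-Null
Receive-Job $job   # outputs: Received: value from the caller's scope
Remove-Job $job
```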
I am tempted to refactor more of this developer build module process to use more background jobs to do more in parallel. However it is a bit tricky in that many of the functions are called both individually and chained together in driver functions and they need to work both in "foreground" and "background" modes. It would also mean a loss of rich progress reporting and things get more difficult in terms of debugging, output, code sharing, etc. Also, while multiple cores may help with parallel work, there's a law of diminishing returns to be considered as well as machine performance while attempting to do other work while PowerShell crunches away.

PowerShell Activity Progress with Time Estimates

The Goal

Recently I worked on a PowerShell module to do various application build related functions on development machines. One side objective was multi-level activity progress reporting with time estimation so activity duration could be gauged over time. The desire was for this to be done generically and quickly as it was more of a nice-to-have.

In Action

While running this, progress reporting looked something like the below.

Calls to Record Activity Progress

Some low-level functions in this module might be called directly at times and other driver functions might get called to chain together several activities. Either way progress is recorded at each level. Each function simply invokes a call to record the start of an activity as the first step and stopping it as the last.
function New-Build() # params omitted, body simplified for brevity
{
    Start-Activity "Performing a new build"
	$percentComplete = 0
	Log-Progress "Creating code" -PercentComplete $percentComplete
	$percentComplete += 33
	Log-Progress "Creating config" -PercentComplete $percentComplete
	$percentComplete += 33
	Log-Progress "Creating DB" -PercentComplete $percentComplete
	New-DB -common
    Stop-Activity
}

function New-CodeGen ([bool]$buildStatusUpdate = $true)
{
    Start-Activity "Running CodeGen with NAnt"
	# ... real work done here ...
    Stop-Activity
}

# ...

Starting an Activity

First a stack is created at the script level to store the activities. The Start-Activity function creates a new object to store the activity name, start time, total duration and estimated duration (more on that later). This information is pushed onto the stack and output is sent to Write-Progress and Write-Output.
$_activityStack = new-object Collections.Stack

function Start-Activity([string]$activity = $(throw "activity is required"))
{
    Write-Output "Activity starting: $activity"
    $act = new-object PSObject
    $act | add-member -membertype NoteProperty -name "StartTime" -value $(get-date)
    $act | add-member -membertype NoteProperty -name "Name" -value $activity
    $act | add-member -membertype NoteProperty -name "TotalSeconds" -Value 0
    $act | add-member -membertype NoteProperty -name "EstimatedSeconds" `
		-Value (Get-ActivityEstimatedSeconds $activity)
    $_activityStack.Push($act)
    Log-Progress $activity
}

Stopping an Activity

Stop-Activity will pop the most recent activity off the stack and calculate the duration. It then writes the completion data to progress and output as well as to a stats file that stores the durations by activity name.
function Stop-Activity
{
    $id = $_activityStack.Count
    $act = $_activityStack.Pop()
    $ts = $(get-date) - $act.StartTime
    $act.TotalSeconds = $ts.TotalSeconds
    $time = ""
    if ($ts.TotalMinutes -ge 1) { $time = "{0:##.00} minute(s)" -f $ts.TotalMinutes }
    else { $time = "{0:##.00} second(s)" -f $ts.TotalSeconds }
    # TODO: add in $act.EstimatedSeconds if > 0
    $status = ("'{0}' complete in {1}" -f $act.Name, $time)
    Write-Progress -Activity $act.Name -Status $status -Completed -Id $id
    Write-Output $status
    Write-Stats $act
}

Writing Activity Stats

Write-Stats takes in the activity object and adds it to an array. If the stats (CSV) file exists it reads it in, sorts the data in descending time order, adds up to $maxKeep (50) existing records to the array, and removes the existing file. The stats file is then written out with the most recent records.
function Write-Stats ($act = $(throw "activity is required"))
{
	# initialize an array to hold recent stats, starting with this activity
	$recent = @($act)
	$statsFile = Get-StatsFilename
	$maxKeep = 50 # across all activities; several different, want a few of each
	if (Test-Path $statsFile)
	{
	 	# get a list of the $maxKeep-1 most recent stats and add each to the $recent array
		# | Where-Object {$_.Name -eq $act.Name}
	 	Import-CSV $statsFile | Sort StartTime -Descending `
			| Select -First ($maxKeep - 1) | foreach {$recent += $_}
		# remove the file as we have the data in memory and want to re-write w/top item & desc time sort
		Remove-Item $statsFile -force
	}
	$recent | select StartTime, Name, TotalSeconds | Export-Csv $statsFile -NoTypeInformation
}

Getting Activity Estimated Time

Time estimation is done via reading in the CSV file, filtering on the activity name, adding the completion time for each to an array, and averaging those values.
function Get-ActivityEstimatedSeconds([string]$activityName)
{
    $statsFile = (Get-StatsFilename)
    $avg = -1
    if (Test-Path $statsFile)
    {
        # collect durations in an array; CSV values are strings so cast to double
        $totalSeconds = @()
        Import-CSV $statsFile | Where-Object {$_.Name -eq $activityName} `
			| foreach {$totalSeconds += [double]$_.TotalSeconds}
        if ($totalSeconds.Count -gt 0)
        {
            $m = $totalSeconds | measure-object -ave
            $avg = $m.Average
        }
    }
    return $avg
}

Misc. Functions

Log-Progress writes both to standard out and to progress. It retrieves the current activity without removing it from the stack, formats the estimated completion time calculated earlier, and adds that to the progress information. The number of current activities is used as the progress bar id since there will be multiple levels; in my case everything is done serially. Initially I set the -SecondsRemaining argument on Write-Progress but found it misleading; my script shells out to various other apps and that is blocking - the value won't update via a timer or anything like that.
function Log-Progress([string]$msg, [int]$percentComplete = 0)
{
    Write-Output $msg
    $act = $_activityStack.Peek()
    $id = $_activityStack.Count
	# the problem with -SecondsRemaining is it won't auto update w/timer or anything
	# so it will be helpful at first and then quickly misleading
	$actName = $act.Name
	if ($act.EstimatedSeconds -ge 0)
	{
		$estFinish = $act.StartTime.AddSeconds($act.EstimatedSeconds)
		$actName += " - Est. Finish @ " + $estFinish.ToString("hh:mm:ss tt")
	}
    Write-Progress -Activity $actName -Status $msg -PercentComplete $percentComplete `
        -Id $id #-SecondsRemaining $act.EstimatedSeconds
}
For simplicity all activity details are stored in the same file, up to the max limit defined in Write-Stats.
function Get-StatsFilename
{
	# considered a separate file per activity but we can filter out of one file;
	# too much clutter w/separate files and would have to build a safe filename from the activity name
	$file = (Join-Path (Get-StatsFilePath) "Stats.csv")
	return $file
}

function Get-StatsFilePath
{
	$path = (join-path $env:LOCALAPPDATA "MyCompany\MyApp\build_output\stats")
	if (!(Test-Path $path)) { New-Item $path -type directory | Out-Null }
	return $path
}

In Conclusion

Other options on timing activities include things like the Stopwatch class or PowerShell's Measure-Command. I'll use those more for ad-hoc measuring here or there. In the case of my module though, I found this "CSV stack" approach to work well as a simple, generic way to time everything across the board. If there are a large number of activity records being kept, deep function chaining, and/or threading work this approach might be a bit more problematic and slow.

Build Automation Part 4: Database and Report Deployments

Series Index

Build Automation Part 1: Overview and Pre-build Tasks
Build Automation Part 2: Building and Packaging
Build Automation Part 3: App Deployment Script
Build Automation Part 4: Database and Report Deployments

Unlike deploying the application itself in its entirety each time, database and report items have required incremental deployments due to their nature and sheer size. The question becomes how to manage these increments in coordination with the application bits to ensure everything is in sync.

I do not pretend to have all the answers here. I was hoping this part of the process could get overhauled more with our last release but there is only so much time in the day.

Some Challenges

Size, Dependencies, Impact Analysis

On the (Oracle) database side of our operational app we have around 800 tables, 500+ packages, 500+ views, 300+ functions and procedures, and a variety of other object types across 3 schemas. Running schema compares / diffs and doing sync scripts has been a time consuming pain in the past, regardless of various tools we have tried.

The situation is complicated by some 10 database environments, other app databases our app is dependent on, various apps depending on our database, and dynamic SQL embedded in report RDL files or coming from ORMs like Entity Framework. Dependency and impact analysis can be difficult, particularly across systems.

On the report side this app has over 600 SSRS reports scattered over 8 servers, several folders and different versions of SQL Server Reporting Services.

Source Control

Source control for reports has not been much of a problem with SSRS, other than TFS generating a lot of unnecessary merge conflicts on those XML RDL files.

On the database side we have had some success using Team Coding in Toad with the TFS MSSCCI Provider. Quest came out with a native TFS provider but it did not support TFS work item association which ruled out our use of it.

The MSSCCI provider "works" with Toad for basic changes like packages, procedures, functions, triggers, views, etc., but boy it is not without a lot of quirks. So much so that for a while I saved any database object script changes to files on my desktop out of fear my changes would get overwritten, which used to happen quite a bit. The other problem is that not all changes are source controlled, such as table schema changes and data changes. We offset that via SQL file attachments to tasks tied to a project.

Pulling Report and Database Changes

I posted before about a custom TFS Artifact Manager (hereinafter "TAM") tool for pulling down report and database changeset files and task attachments either for a single work item or a large batch of related work items, such as all tasks linked through a series of scenarios, backlog items and bugs tied to a project. I won't repeat those details here but we currently still use the tool to gather up all the report and database changes for a given project release. It is far from perfect but it beats manual guesswork and building sync scripts from database compares.

The TAM tool is also used to pull artifacts from various single-task ad hoc changes made outside of any official product release. Many reports can be added to the application through dynamic report forms with common parameters; new records are added to the database and the reports are deployed outside of any app deployment. Likewise there are occasional database changes made independent of the application.

There are other tools mentioned in the TAM post that we may try using more in the future. Also, Troy Hunt has a good series of deployment posts including this one regarding Automated database releases with TeamCity and Red Gate. Doing exactly that with our schemas would make me a bit nervous but perhaps with tweaks and some experiments in the future.

Additionally I posted a review on Red Gate's Schema Compare for Oracle which can be quite a useful tool. We don't rely on it as heavily anymore with custom tools and processes but it is handy to use it to double-check things after deployments or for doing more one-off database syncs.

Deploying Reports

Ad hoc Report Deployments

When deploying from Development to Test, our CI process picks up any reports merged from $/MyApp/Reports/Dev/ to $/MyApp/Reports/Main/ and automatically deploys those via PowerShell and RS Utility. Any corresponding database changes are manually applied before that. Environments beyond Test currently require attaching report and database changes to a Help Desk ticket and routing to a DBA. The DBA runs any database changes and deploys the reports using the script mentioned in the next section.

App Report Deployments

For reports to be deployed with a given build of an application, the process is basically the same as the ad hoc process with respect to going from Dev to Test. One difference is the timing of merging the report changes in source control to correspond with any dependent changes to the application code. When moving beyond the Test environment, all reports tied to a given project work item are pulled using the TAM tool. They are then deployed en masse using PowerShell and the SSRS web service (without RS Utility), in a manner similar to this post on Uploading SSRS Reports with PowerShell.
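A rough sketch of that kind of mass upload follows, using the SSRS ReportService2010 endpoint's CreateCatalogItem method; the server URL, local folder, and target folder here are hypothetical placeholders, and error handling is omitted:

```powershell
# connect to the SSRS web service (server URL is a placeholder)
$proxy = New-WebServiceProxy -Uri "http://reportserver/ReportServer/ReportService2010.asmx?wsdl" `
    -UseDefaultCredential

$targetFolder = "/MyApp/Reports"   # hypothetical target folder on the report server

foreach ($rdl in (Get-ChildItem ".\Reports" -Filter *.rdl))
{
    $bytes = [IO.File]::ReadAllBytes($rdl.FullName)
    $warnings = $null
    # $true = overwrite any existing report of the same name
    $proxy.CreateCatalogItem("Report", $rdl.BaseName, $targetFolder, `
        $true, $bytes, $null, [ref]$warnings) | Out-Null
    $warnings | foreach { Write-Warning $_.Message }
}
```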

Deploying Database Changes

We currently do not have much automation around ad hoc database deployments but not much is needed there.

For app database changes we start by pulling the database changes using the TAM tool. In that tool a script "package" is built by choosing the scripts to include and specifying any needed order of execution. Previously someone (dev or DBA depending on environment) would either execute all those by hand in Toad, or build out an index/driver script and run that. It was not as bad as it might sound, given the tool produced combined SQL scripts for views, packages, procs, etc. Still it was tedious if there were a number of data or schema migration scripts to be run in order.

Executing Database Scripts With PowerShell

Our resident Oracle expert Jim Tilson ("The Optimizer") had the idea of creating a PowerShell script that used SQL*Plus to generically execute all database scripts in a given directory. I paired with him to get the basic interaction going but this is his brainchild and work. He should probably be the one explaining this but no telling when that slacker will get around to blogging :). If you have any interest in Oracle, SQL optimization, database tech in general, or Ruby, you should reach out to him on Twitter and ask him to blog more (and tweet more while he is at it). At any rate this might be useful for others so I will post the code and attempt to explain it.


The script expects all the database script files to be located in subfolders next to the script, one folder per schema name, with no subfolders within each schema folder (it is not recursive).

Each file in a schema folder will be executed regardless of filename extension. Ordering is based on filename; our TAM tool prefixes a numeric wart on each file to ensure an obvious order. At the moment the script does not explicitly specify a name ordering but that's the default.
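As a quick illustration of why a zero-padded numeric wart matters (these filenames are made up), name ordering is lexicographic, so an unpadded 10 sorts before 2; zero padding keeps the intended execution order:

```powershell
# unpadded warts: "10_..." sorts before "2_..." lexicographically
"10_update_package.sql", "2_migrate_data.sql", "1_create_table.sql" | Sort-Object

# zero-padded warts preserve the intended numeric execution order
"10_update_package.sql", "02_migrate_data.sql", "01_create_table.sql" | Sort-Object
```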

Executing Scripts for Each Schema

At the bottom of the script, the server TNS name is prompted for and a couple things are set before running the main run-scripts function.
set-location (Get-ScriptDirectory)
$dateWart = Get-DateWart
$server = read-host "Enter the server TNS name"
Run-Scripts
echo "Successfully ran all scripts."
Run-Scripts invokes a function to run the scripts for each schema, passing along the user and server info. This could be made more generic by assuming any subfolder where the PowerShell script resides represents a schema with database scripts to run.
function Run-Scripts
{
    foreach ($user in @("USER_SCHEMA_1", "USER_SCHEMA_2", "USER_SCHEMA_3"))
    {
		run-scriptsforschema -server $server -user $user
    }
}

Running Scripts for a Schema

This function will temporarily set location to the schema subfolder corresponding to the user/schema name passed in. It prompts for a password for a later connection to that schema. Finally it enumerates all files in the schema folder, calls a Run-Script function to execute each, and writes progress as it goes.

Each successfully executed file is moved into a Completed subfolder. That file move is important as many migration scripts are written assuming they will only be run once, and we have been bitten by DBAs accidentally running scripts more than once.
function Run-ScriptsForSchema($server, $user)
{
    echo "Running scripts for $user."
    set-location $user
	$password = get-password($user)

    ensure-directory ".\Completed"
    $files = @(get-childitem | where {!$_.PsIsContainer})
	$count = 0
    foreach ($fileInfo in $files)
    {
        write-progress -activity "Running scripts for $user" `
			-currentoperation $fileInfo.Name -status ("Executing") `
			-PercentComplete (100 * $count / $files.count)
        Run-Script $user $password $fileInfo
        $count++
        write-progress -activity "Running scripts for $user" `
			-currentoperation $fileInfo.Name -status ("Done") `
			-PercentComplete (100 * $count / $files.count)
        move-item -path $fileInfo.fullname -destination ".\Completed" -Force
    }
    write-progress -activity "Running scripts for $user" -status ("Complete") -completed
    echo "Completed scripts for $user"
    set-location ..
}

Running a Script

The Run-Script function takes care of some logging and calls a normalize function to tack on additional SQL before and after the SQL contained in the file (error handling, commits, etc.); more on that in a moment. Function notes follow.
  • Path - the sqlplus location should be in the SYSTEM PATH environment variable so fully qualifying it should not be needed. In my case the location is C:\app\[username]\product\11.2.0\client_1\.
  • -L Parameter - instructs the app to only attempt to log on once; otherwise with bad credentials it can hang awaiting input.
  • -M Parameter - indicates HTML output is desired from sqlplus.
  • Credentials - The server name captured earlier is passed in along with the user/schema and password parameter values.
  • SQL - the normalize script function returns a SQL string and that is piped into sqlplus to be executed.
  • Output - the output is appended to $logFile and 2>&1 sends standard error to standard output.
  • Error checking - Finally $LASTEXITCODE is checked to see what sqlplus.exe exited with; if 0 it was successful, otherwise it is the Oracle error number. The process stops on any error; manual changes might be needed to address any problems then the script can be run again.
function Run-Script($user, $password, $fileInfo)
{
    $logDir = ("..\Logs\{0}\{1}" -f $dateWart, $user)
    ensure-directory $logDir
    $logFile = join-path $logDir ($fileInfo.basename + ".html")
    (normalize-script $fileInfo.fullname) | sqlplus.exe -L -M "HTML ON SPOOL ON" `
		-S "$user/""$password""@$server" >> $logFile 2>&1

    $lec = $LASTEXITCODE
    if ($lec -ne 0)
    {
        write-error ("ERROR executing {0}!" -f $fileInfo.FullName)
        exit $lec
    }
}

Adjusting the SQL

There are two critical adjustments made to the SQL read from the database script files. The first is detecting a SQL error and exiting SQL*Plus with the error code. The other is issuing a commit at the end; most of our data related scripts do not include a commit as often they are run and verified before committing. It is worth reading over the WHENEVER SQLERROR documentation as some types of errors will not trigger an exit; fully checking for all types of errors might require something more brittle like scanning the log files for certain error phrases.
function normalize-script($filename)
{
@"
whenever sqlerror exit sql.sqlcode
set echo off
set termout off
$([string]::join("`n", (get-content $filename -readcount 0)))
commit;
exit
"@
}

Helper Functions

At the top of the script are some helper functions; the interop services Marshal class is used later to translate the secure password to a plain text string to be passed along to SQL*Plus.

function Get-ScriptDirectory
{
    Split-Path ((Get-Variable MyInvocation -scope script).Value.MyCommand.Path)
}

function ensure-directory($dir)
{
    if (!(test-path $dir)) { new-item $dir -type directory | Out-Null }
}

function Get-DateWart()
{
    get-date -uformat "%Y %m %d %H %M %S"
}

function get-password($user)
{
    $securePassword = read-host -AsSecureString "Password for $user@$server"
    # convert the secure string to plain text to pass to sqlplus
    [Runtime.InteropServices.Marshal]::PtrToStringAuto(
        [Runtime.InteropServices.Marshal]::SecureStringToBSTR($securePassword))
}

Analyzing the Results

The script stores logs under a Logs\[Timestamp]\Schema\ folder for troubleshooting and verification purposes.

SQL*Plus Alternatives

One alternative to SQL*Plus is using OracleCommand's ExecuteNonQuery method in Oracle.DataAccess.dll. I tried this approach back when I created an OracleScriptExecutor utility app that was designed to easily run SQL Scripts against multiple schemas. It was a bit of a nightmare that I do not recommend. For one you have to deal with annoyances like linefeed issues, semicolon and BEGIN/END block issues, and it is quite difficult to deal with multiple scripts combined in one SQL file (i.e. '/' delimited). It almost requires a full blown SQL parsing engine to handle it correctly so I'd rather delegate that pain to a tool like SQL*Plus that already handles such complexity.

In Conclusion

This really only scratches the surface of the problems and solutions in dealing with database and report deployments. With some time and TLC I am sure this beast could be further tamed. Thoughts, suggestions, tips, helpful tools, processes? Leave a comment!

I am running out of time in this series but hopefully I can touch on some CI details with TeamCity next.