Using F# Asynchronous Workflows to simplify concurrent programming in AutoCAD

In the last post we saw some code that downloaded data - serially - from a number of websites via RSS and created AutoCAD entities linking to the various posts.

As promised, in today's post we take that code and enable it to query the same data in parallel by using Asynchronous Workflows in F#. Asynchronous Workflows are an easy-to-use yet powerful mechanism for enabling concurrent programming in F#.

Firstly, a little background as to why this type of technique is important. As many - if not all - of you are aware, the days of raw processor speed doubling every couple of years are over. The technical innovations that enabled Moore's Law to hold true for half a century are - at least in the area of silicon-based microprocessor design - hitting a wall (it's apparently called the Laws of Physics :-). Barring some disruptive technological development, the future gains in computing performance are to be found in the use of parallel processing, whether via multiple cores, processors or distributed clouds of computing resources.

Additionally, with an increasing focus on distributed computing and information resources, managing tasks asynchronously becomes more important, as information requests across a network inevitably introduce a latency that can be mitigated by the tasks being run in parallel.

The big problem is that concurrent programming is - for the most-part - extremely difficult to do, and even harder to retro-fit into existing applications. Traditional lock-based parallelism (where locks are used to control access to shared computing resources) is both unwieldy and prone to blocking. New technologies, such as Asynchronous Workflows and Software Transactional Memory, provide considerable hope (and this is a topic I have on my list to cover at some future point).

Today's post looks at a relatively simple scenario, in the sense that we want to perform a set of discrete tasks in parallel, harnessing those fancy multi-core systems for those of you lucky enough to have them (I'm hoping to get one when I next replace my notebook, sometime in March), but that these tasks are indeed independent: we want to wait until they are all complete, but we do not have the additional burden of them communicating amongst themselves or using shared resources (e.g. accessing shared memory) during their execution.

We are also going to be very careful only to run parallel tasks unrelated to AutoCAD. Any access made into AutoCAD's database, for instance, needs to be performed in series: AutoCAD is not thread-safe when it comes to the vast majority of its programmatically-accessible functionality. So we're going to run a set of asynchronous, parallel tasks to query our various RSS feeds, and combine the results before creating the corresponding geometry in AutoCAD. This all sounds very complex, but the good (actually great) news is that Asynchronous Workflows does all the heavy lifting. Phew.

Here's the modified F# code, with the modified/new lines coloured in red:

    1 // Use lightweight F# syntax

    2

    3 #light

    4

    5 // Declare a specific namespace and module name

    6

    7 module MyNamespace.MyApplication

    8

    9 // Import managed assemblies

   10

   11 #I @"C:\Program Files\Autodesk\AutoCAD 2008"

   12

   13 #r "acdbmgd.dll"

   14 #r "acmgd.dll"

   15

   16 open Autodesk.AutoCAD.Runtime

   17 open Autodesk.AutoCAD.ApplicationServices

   18 open Autodesk.AutoCAD.DatabaseServices

   19 open Autodesk.AutoCAD.Geometry

   20 open System.Xml

   21 open System.Collections

   22 open System.Collections.Generic

   23 open System.IO

   24 open System.Net

   25 open Microsoft.FSharp.Control.CommonExtensions

   26

   27 // The RSS feeds we wish to get. The first two values are

   28 // only used if our code is not able to parse the feed's XML

   29

   30 let feeds =

   31   [ ("Through the Interface",

   32     "http://blogs.autodesk.com/through-the-interface",

   33     "http://through-the-interface.typepad.com/through_the_interface/rss.xml");

   34

   35     ("Don Syme's F# blog",

   36     "http://blogs.msdn.com/dsyme/",

   37     "http://blogs.msdn.com/dsyme/rss.xml");

   38

   39     ("Shaan Hurley's Between the Lines",

   40     "http://autodesk.blogs.com/between_the_lines",

   41     "http://autodesk.blogs.com/between_the_lines/rss.xml");

   42

   43     ("Scott Sheppard's It's Alive in the Lab",

   44     "http://blogs.autodesk.com/labs",

   45     "http://labs.blogs.com/its_alive_in_the_lab/rss.xml");

   46

   47     ("Lynn Allen's Blog",

   48     "http://blogs.autodesk.com/lynn",

   49     "http://lynn.blogs.com/lynn_allens_blog/index.rdf");

   50

   51     ("Heidi Hewett's AutoCAD Insider",

   52     "http://blogs.autodesk.com/autocadinsider",

   53     "http://heidihewett.blogs.com/my_weblog/index.rdf") ]

   54

   55 // Fetch the contents of a web page, asynchronously

   56

   57 let httpAsync(url:string) =

   58   async { let req = WebRequest.Create(url)

   59           use! resp = req.GetResponseAsync()

   60           use stream = resp.GetResponseStream()

   61           use reader = new StreamReader(stream)

   62           return reader.ReadToEnd() }

   63

   64 // Load an RSS feed's contents into an XML document object

   65 // and use it to extract the titles and their links

   66 // Hopefully these always match (this could be coded more

   67 // defensively)

   68

   69 let titlesAndLinks (name, url, xml) =

   70   let xdoc = new XmlDocument()

   71   xdoc.LoadXml(xml)

   72

   73   let titles =

   74     [ for n in xdoc.SelectNodes("//*[name()='title']")

   75         -> n.InnerText ]

   76   let links =

   77     [ for n in xdoc.SelectNodes("//*[name()='link']") ->

   78         let inn = n.InnerText

   79         if  inn.Length > 0 then

   80           inn

   81         else

   82           let href = n.Attributes.GetNamedItem("href").Value

   83           let rel = n.Attributes.GetNamedItem("rel").Value

   84           if href.Contains("feedburner") then

   85               ""

   86           else

   87             href ]

   88

   89   let descs =

   90     [ for n in xdoc.SelectNodes

   91         ("//*[name()='description' or name()='content' or name()='subtitle']")

   92           -> n.InnerText ]

   93

   94   // A local function to filter out duplicate entries in

   95   // a list, maintaining their current order.

   96   // Another way would be to use:

   97   //    Set.of_list lst |> Set.to_list

   98   // but that results in a sorted (probably reordered) list.

   99

  100   let rec nub lst =

  101     match lst with

  102     | a::[] -> [a]

  103     | a::b ->

  104       if a = List.hd b then

  105         nub b

  106       else

  107         a::nub b

  108     | [] -> []

  109

  110   // Filter the links to get (hopefully) the same number

  111   // and order as the titles and descriptions

  112

  113   let real = List.filter (fun (x:string) -> x.Length > 0) 

  114   let lnks = real links |> nub

  115

  116   // Return a link to the overall blog, if we don't have

  117   // the same numbers of titles, links and descriptions

  118

  119   let lnum = List.length lnks

  120   let tnum = List.length titles

  121   let dnum = List.length descs

  122

  123   if tnum = 0 || lnum = 0 || lnum <> tnum || dnum <> tnum then

  124     [(name,url,url)]

  125   else

  126     List.zip3 titles lnks descs

  127

  128 // For a particular (name,url) pair,

  129 // create an AutoCAD HyperLink object

  130

  131 let hyperlink (name,url,desc) =

  132   let hl = new HyperLink()

  133   hl.Name <- url

  134   hl.Description <- desc

  135   (name, hl)

  136

  137 // Use asynchronous workflows in F# to download

  138 // an RSS feed and return AutoCAD HyperLinks

  139 // corresponding to its posts

  140

  141 let hyperlinksAsync (name, url, feed) =

  142   async { let! xml = httpAsync feed

  143           let tl = titlesAndLinks (name, url, xml)

  144           return List.map hyperlink tl }

  145

  146 // Now we declare our command

  147

  148 [<CommandMethod("rss")>]

  149 let createHyperlinksFromRss() =

  150

  151   // Let's get the usual helpful AutoCAD objects

  152

  153   let doc =

  154     Application.DocumentManager.MdiActiveDocument

  155   let db = doc.Database

  156

  157   // "use" has the same effect as "using" in C#

  158

  159   use tr =

  160     db.TransactionManager.StartTransaction()

  161

  162   // Get appropriately-typed BlockTable and BTRs

  163

  164   let bt =

  165     tr.GetObject

  166       (db.BlockTableId,OpenMode.ForRead)

  167     :?> BlockTable

  168   let ms =

  169     tr.GetObject

  170       (bt.[BlockTableRecord.ModelSpace],

  171       OpenMode.ForWrite)

  172     :?> BlockTableRecord

  173

  174   // Add text objects linking to the provided list of

  175   // HyperLinks, starting at the specified location

  176

  177   // Note the valid use of tr and ms, as they are in scope

  178

  179   let addTextObjects pt lst =

  180     // Use a for loop, as we care about the index to

  181     // position the various text items

  182

  183     let len = List.length lst

  184     for index = 0 to len - 1 do

  185       let txt = new DBText()

  186       let (name:string,hl:HyperLink) = List.nth lst index

  187       txt.TextString <- name

  188       let offset =

  189         if index = 0 then

  190           0.0

  191         else

  192           1.0

  193

  194       // This is where you can adjust:

  195       //  the initial outdent (x value)

  196       //  and the line spacing (y value)

  197

  198       let vec =

  199         new Vector3d

  200           (1.0 * offset,

  201           -0.5 * (Int32.to_float index),

  202           0.0)

  203       let pt2 = pt + vec

  204       txt.Position <- pt2

  205       ms.AppendEntity(txt) |> ignore

  206       tr.AddNewlyCreatedDBObject(txt,true)

  207       txt.Hyperlinks.Add(hl) |> ignore

  208

  209   // Here's where we do the real work, by firing

  210   // off - and coordinating - asynchronous tasks

  211   // to create HyperLink objects for all our posts

  212

  213   let links =

  214     Async.Run

  215       (Async.Parallel

  216         [ for (name,url,feed) in feeds ->

  217           hyperlinksAsync (name,url,feed) ])

  218

  219   // Add the resulting objects to the model-space 

  220

  221   let len = Array.length links

  222   for index = 0 to len - 1 do

  223

  224     // This is where you can adjust:

  225     //  the column spacing (x value)

  226     //  the vertical offset from origin (y axis)

  227

  228     let pt =

  229       new Point3d

  230         (15.0 * (Int32.to_float index),

  231         30.0,

  232         0.0)

  233     addTextObjects pt (Array.get links index)

  234

  235   tr.Commit()

You can download the new F# source file from here.

A few comments on the changes:

Lines 57-62 define our new httpAsync() function, which uses GetResponseAsync() - a function exposed in F# 1.9.3.7 - to download the contents of a web-page asynchronously [and which I stole shamelessly from Don Syme, who presented the code last summer at Microsoft's TechEd].

Lines 141-144 define another asynchronous function, hyperlinksAsync(), which calls httpAsync() and then - as before - extracts the feed information and creates a corresponding list of HyperLinks. This is significant: creation of AutoCAD HyperLink objects will be done on parallel; it is the addition of these objects to the drawing database that needs to be performed serially.

Lines 214-217 replace our very simple "map" with something slightly more complex: this code runs a list of tasks in parallel and waits for them all to complete before continuing. What is especially cool about this implementation is the fact that exceptions in individual tasks result in the overall task failing (a good thing, believe it or not :-), and the remaining tasks being terminated gracefully.

Lines 221 and 233 change our code to handle an array, rather than a list (while "map" previously returned a list, Async.Run returns an array).

When run, the code creates exactly the same thing as last time (although there are a few more posts in some of the blogs ;-)

Autocad_does_rss_3

A quick word on timing: I used "F# Interactive" to do a little benchmarking on my system, and even though it's single-core, single-processor, there was a considerable difference between the two implementations. I'll talk more about F# Interactive at some point, but think of it to F# in Visual Studio as the command-line is to LISP in AutoCAD: you can very easily test out fragments of F#, either by entering them directly into the F# Interactive window or highlighting them in Visual Studio's text editor and hitting Alt-Enter.

To enable function timing I entered "#time;;" (without the quotations marks) in the F# Interactive window. I then selected and loaded the supporting functions needed for each test - not including the code that adds the DBText objects with their HyperLinks to the database, as we're only in Visual Studio, not inside AutoCAD - and executed the "let links = ..." assignment in our two implementations of the createHyperlinksFromRss() function (i.e. the RSS command). These functions do create lists of AutoCAD HyperLinks, but that's OK: this is something works even outside AutoCAD, although we wouldn't be able to do anything much with them. Also, the fact we're not including the addition of the entities to the AutoCAD database is not relevant: by then we should have identical data in both versions, which would be added in exactly the same way.

Here are the results:

I executed the code for serial querying and parallel querying twice (to make sure there were no effects from page caching on the measurement):

val links : (string * HyperLink) list list

Real: 00:00:14.636, CPU: 00:00:00.15, GC gen0: 5, gen1: 1, gen2: 0

val links : (string * HyperLink) list array

Real: 00:00:06.245, CPU: 00:00:00.31, GC gen0: 3, gen1: 0, gen2: 0

val links : (string * HyperLink) list list

Real: 00:00:15.45, CPU: 00:00:00.46, GC gen0: 5, gen1: 1, gen2: 0

val links : (string * HyperLink) list array

Real: 00:00:03.832, CPU: 00:00:00.62, GC gen0: 2, gen1: 1, gen2: 0

So the serial execution took 14.5 to 15.5 seconds, while the parallel execution took 3.8 to 6.3 seconds.

January 25, 2008 in AutoCAD, AutoCAD .NET, Concurrent programming, F#, Weblogs | Permalink | Comments (1) | TrackBack

Turning AutoCAD into an RSS reader with F#

OK, OK, you are probably thinking "why would anyone ever want to use AutoCAD as an RSS reader?". The answer is, of course, "they wouldn't". The point of the next few posts is not actually to enable AutoCAD to be used to read RSS, but to show how it is possible to use F# and .NET to extract information from RSS feeds and create corresponding AutoCAD entities.

The reason I came onto this subject will also become more clear when you see my next post: I have been researching Asynchronous Workflows in F# - an uber-cool mechanism for managing concurrent, asynchronous tasks - and this seemed like a valid place to start. The problem I was looking for was one where I could simultaneously query and manipulate data from multiple sources, and then use that data to create AutoCAD entities. So, ultimately, the choice of RSS was both logical and completely irrelevant. :-)

Today I'm going to present code that works synchronously: in a single thread we are going to query website after website to download individual RSS feeds and to process them, extracting information on the various posts listed in the RSS, and create HyperLink objects in AutoCAD attached to DBText entities. These will be laid out such that - if you really, really wanted to - you could use these entities to open the various posts in your internet browser.

The reason I chose F# was really the ability to succinctly launch and coordinate asynchronous tasks - something you'll see in the next post, of course. While I could have used C# or VB.NET, F# is also well suited to dealing with lists of data - such as we'll be extracting from the various RSS feeds.

I used F# 1.9.3.7 to run this code: you will certainly need this version to run the code in the following post, as the asynchronous HTTP request functionality is new to the 1.9.3.7 release.

A few additional notes on the implementation... The below code somehow manages to support various RSS standards: Atom, RSS 1.0, RSS 2.0. But some of it feels like a bit of a "hack". The code queries for the titles, links and descriptions contained in each feed, and does some programmatic manipulation to end up - in the cases I've tested - with equal numbers of each. Feeds that use Feedburner, for instance, contain various types of link, which made this very tricky, but the below code appears to work for most cases. The point of this exercise is not to implement an "all singing, all dancing" implementation for RSS consumption: I simply did what was needed to get a number of different blogs working. If a particular feed you add doesn't work, you will just get a single entry created inside AutoCAD. Please don't expect me to debug why it doesn't work for that feed, as that was never the point of this exercise (and I wasted far too long getting to this point, believe me :-).

Here's the F# code:

// Use lightweight F# syntax


#light


// Declare a specific namespace and module name


module MyNamespace.MyApplication


// Import managed assemblies


#I @"C:\Program Files\Autodesk\AutoCAD 2008"


#r "acdbmgd.dll"

#r "acmgd.dll"


open Autodesk.AutoCAD.Runtime

open Autodesk.AutoCAD.ApplicationServices

open Autodesk.AutoCAD.DatabaseServices

open Autodesk.AutoCAD.Geometry

open System.Xml

open System.Collections

open System.Collections.Generic

open System.IO

open System.Net

open Microsoft.FSharp.Control.CommonExtensions


// The RSS feeds we wish to get. The first two values are

// only used if our code is not able to parse the feed's XML


let feeds =

  [ ("Through the Interface",

    "http://blogs.autodesk.com/through-the-interface",

    "http://through-the-interface.typepad.com/through_the_interface/rss.xml");


    ("Don Syme's F# blog",

    "http://blogs.msdn.com/dsyme/",

    "http://blogs.msdn.com/dsyme/rss.xml");


    ("Shaan Hurley's Between the Lines",

    "http://autodesk.blogs.com/between_the_lines",

    "http://autodesk.blogs.com/between_the_lines/rss.xml");


    ("Scott Sheppard's It's Alive in the Lab",

    "http://blogs.autodesk.com/labs",

    "http://labs.blogs.com/its_alive_in_the_lab/rss.xml");


    ("Lynn Allen's Blog",

    "http://blogs.autodesk.com/lynn",

    "http://lynn.blogs.com/lynn_allens_blog/index.rdf");


    ("Heidi Hewett's AutoCAD Insider",

    "http://blogs.autodesk.com/autocadinsider",

    "http://heidihewett.blogs.com/my_weblog/index.rdf") ]


// Fetch the contents of a web page, synchronously


let httpSync (url:string) =

  let req = WebRequest.Create(url)

  use resp = req.GetResponse()

  use stream = resp.GetResponseStream()

  use reader = new StreamReader(stream)

  reader.ReadToEnd()


// Load an RSS feed's contents into an XML document object

// and use it to extract the titles and their links

// Hopefully these always match (this could be coded more

// defensively)


let titlesAndLinks (name, url, xml) =

  let xdoc = new XmlDocument()

  xdoc.LoadXml(xml)


  let titles =

    [ for n in xdoc.SelectNodes("//*[name()='title']")

        -> n.InnerText ]

  let links =

    [ for n in xdoc.SelectNodes("//*[name()='link']") ->

        let inn = n.InnerText

        if  inn.Length > 0 then

          inn

        else

          let href = n.Attributes.GetNamedItem("href").Value

          let rel = n.Attributes.GetNamedItem("rel").Value

          if href.Contains("feedburner") then

              ""

          else

            href ]


  let descs =

    [ for n in xdoc.SelectNodes

        ("//*[name()='description' or name()='content' or name()='subtitle']")

          -> n.InnerText ]


  // A local function to filter out duplicate entries in

  // a list, maintaining their current order.

  // Another way would be to use:

  //    Set.of_list lst |> Set.to_list

  // but that results in a sorted (probably reordered) list.


  let rec nub lst =

    match lst with

    | a::[] -> [a]

    | a::b ->

      if a = List.hd b then

        nub b

      else

        a::nub b

    | [] -> []


  // Filter the links to get (hopefully) the same number

  // and order as the titles and descriptions


  let real = List.filter (fun (x:string) -> x.Length > 0) 

  let lnks = real links |> nub


  // Return a link to the overall blog, if we don't have

  // the same numbers of titles, links and descriptions


  let lnum = List.length lnks

  let tnum = List.length titles

  let dnum = List.length descs


  if tnum = 0 || lnum = 0 || lnum <> tnum || dnum <> tnum then

    [(name,url,url)]

  else

    List.zip3 titles lnks descs


// For a particular (name,url) pair,

// create an AutoCAD HyperLink object


let hyperlink (name,url,desc) =

  let hl = new HyperLink()

  hl.Name <- url

  hl.Description <- desc

  (name, hl)


// Download an RSS feed and return AutoCAD HyperLinks for its posts


let hyperlinksSync (name, url, feed) =

  let xml = httpSync feed

  let tl = titlesAndLinks (name, url, xml)

  List.map hyperlink tl


// Now we declare our command


[<CommandMethod("rss")>]

let createHyperlinksFromRss() =


  // Let's get the usual helpful AutoCAD objects


  let doc =

    Application.DocumentManager.MdiActiveDocument

  let db = doc.Database


  // "use" has the same effect as "using" in C#


  use tr =

    db.TransactionManager.StartTransaction()


  // Get appropriately-typed BlockTable and BTRs


  let bt =

    tr.GetObject

      (db.BlockTableId,OpenMode.ForRead)

    :?> BlockTable

  let ms =

    tr.GetObject

      (bt.[BlockTableRecord.ModelSpace],

      OpenMode.ForWrite)

    :?> BlockTableRecord


  // Add text objects linking to the provided list of

  // HyperLinks, starting at the specified location


  // Note the valid use of tr and ms, as they are in scope


  let addTextObjects pt lst =

    // Use a for loop, as we care about the index to

    // position the various text items


    let len = List.length lst

    for index = 0 to len - 1 do

      let txt = new DBText()

      let (name:string,hl:HyperLink) = List.nth lst index

      txt.TextString <- name

      let offset =

        if index = 0 then

          0.0

        else

          1.0


      // This is where you can adjust:

      //  the initial outdent (x value)

      //  and the line spacing (y value)


      let vec =

        new Vector3d

          (1.0 * offset,

          -0.5 * (Int32.to_float index),

          0.0)

      let pt2 = pt + vec

      txt.Position <- pt2

      ms.AppendEntity(txt) |> ignore

      tr.AddNewlyCreatedDBObject(txt,true)

      txt.Hyperlinks.Add(hl) |> ignore


  // Here's where we use the varous functions

  // we've defined


  let links =

    List.map hyperlinksSync feeds


  // Add the resulting objects to the model-space 


  let len = List.length links

  for index = 0 to len - 1 do


    // This is where you can adjust:

    //  the column spacing (x value)

    //  the vertical offset from origin (y axis)


    let pt =

      new Point3d

        (15.0 * (Int32.to_float index),

        30.0,

        0.0)

    addTextObjects pt (List.nth links index)


  tr.Commit()

Here's a portion of what gets created when you run the "rss" command:

AutoCAD does RSS   

That's it for today - in the next post we'll look at how to use asynchronous workflows to run RSS extraction tasks in parallel.

January 23, 2008 in AutoCAD, AutoCAD .NET, F#, Weblogs | Permalink | Comments (3) | TrackBack

New Autodesk Labs blog

Scott Sheppard - the original author of the very informative DWF blog, Beyond the Paper - has launched a new blog talking about Autodesk Labs, called It's Alive in the Lab.

I'm looking forward to seeing lots of cool information on Labs technology coming through this blog.

April 12, 2007 in Weblogs | Permalink | Comments (1) | TrackBack