Kean Walmsley

July 2009

Sun Mon Tue Wed Thu Fri Sat
      1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31  

Twitter Updates

    follow me on Twitter



    « Turning AutoCAD into an RSS reader with F# | Main | Using F# to simulate hardware behaviour »

    January 25, 2008

    Using F# Asynchronous Workflows to simplify concurrent programming in AutoCAD

    In the last post we saw some code that downloaded data - serially - from a number of websites via RSS and created AutoCAD entities linking to the various posts.

    As promised, in today's post we take that code and enable it to query the same data in parallel by using Asynchronous Workflows in F#. Asynchronous Workflows are an easy-to-use yet powerful mechanism for enabling concurrent programming in F#.

    Firstly, a little background as to why this type of technique is important. As many - if not all - of you are aware, the days of raw processor speed doubling every couple of years are over. The technical innovations that enabled Moore's Law to hold true for half a century are - at least in the area of silicon-based microprocessor design - hitting a wall (it's apparently called the Laws of Physics :-). Barring some disruptive technological development, the future gains in computing performance are to be found in the use of parallel processing, whether via multiple cores, processors or distributed clouds of computing resources.

    Additionally, with an increasing focus on distributed computing and information resources, managing tasks asynchronously becomes more important, as information requests across a network inevitably introduce a latency that can be mitigated by the tasks being run in parallel.

    The big problem is that concurrent programming is - for the most-part - extremely difficult to do, and even harder to retro-fit into existing applications. Traditional lock-based parallelism (where locks are used to control access to shared computing resources) is both unwieldy and prone to blocking. New technologies, such as Asynchronous Workflows and Software Transactional Memory, provide considerable hope (and this is a topic I have on my list to cover at some future point).

    Today's post looks at a relatively simple scenario, in the sense that we want to perform a set of discrete tasks in parallel, harnessing those fancy multi-core systems for those of you lucky enough to have them (I'm hoping to get one when I next replace my notebook, sometime in March), but that these tasks are indeed independent: we want to wait until they are all complete, but we do not have the additional burden of them communicating amongst themselves or using shared resources (e.g. accessing shared memory) during their execution.

    We are also going to be very careful only to run parallel tasks unrelated to AutoCAD. Any access made into AutoCAD's database, for instance, needs to be performed in series: AutoCAD is not thread-safe when it comes to the vast majority of its programmatically-accessible functionality. So we're going to run a set of asynchronous, parallel tasks to query our various RSS feeds, and combine the results before creating the corresponding geometry in AutoCAD. This all sounds very complex, but the good (actually great) news is that Asynchronous Workflows does all the heavy lifting. Phew.

    Here's the modified F# code, with the modified/new lines coloured in red:

        1 // Use lightweight F# syntax

        2

        3 #light

        4

        5 // Declare a specific namespace and module name

        6

        7 module MyNamespace.MyApplication

        8

        9 // Import managed assemblies

       10

       11 #I @"C:\Program Files\Autodesk\AutoCAD 2008"

       12

       13 #r "acdbmgd.dll"

       14 #r "acmgd.dll"

       15

       16 open Autodesk.AutoCAD.Runtime

       17 open Autodesk.AutoCAD.ApplicationServices

       18 open Autodesk.AutoCAD.DatabaseServices

       19 open Autodesk.AutoCAD.Geometry

       20 open System.Xml

       21 open System.Collections

       22 open System.Collections.Generic

       23 open System.IO

       24 open System.Net

       25 open Microsoft.FSharp.Control.CommonExtensions

       26

       27 // The RSS feeds we wish to get. The first two values are

       28 // only used if our code is not able to parse the feed's XML

       29

       30 let feeds =

       31   [ ("Through the Interface",

       32     "http://blogs.autodesk.com/through-the-interface",

       33     "http://through-the-interface.typepad.com/through_the_interface/rss.xml");

       34

       35     ("Don Syme's F# blog",

       36     "http://blogs.msdn.com/dsyme/",

       37     "http://blogs.msdn.com/dsyme/rss.xml");

       38

       39     ("Shaan Hurley's Between the Lines",

       40     "http://autodesk.blogs.com/between_the_lines",

       41     "http://autodesk.blogs.com/between_the_lines/rss.xml");

       42

       43     ("Scott Sheppard's It's Alive in the Lab",

       44     "http://blogs.autodesk.com/labs",

       45     "http://labs.blogs.com/its_alive_in_the_lab/rss.xml");

       46

       47     ("Lynn Allen's Blog",

       48     "http://blogs.autodesk.com/lynn",

       49     "http://lynn.blogs.com/lynn_allens_blog/index.rdf");

       50

       51     ("Heidi Hewett's AutoCAD Insider",

       52     "http://blogs.autodesk.com/autocadinsider",

       53     "http://heidihewett.blogs.com/my_weblog/index.rdf") ]

       54

       55 // Fetch the contents of a web page, asynchronously

       56

       57 let httpAsync(url:string) =

       58   async { let req = WebRequest.Create(url)

       59           use! resp = req.GetResponseAsync()

       60           use stream = resp.GetResponseStream()

       61           use reader = new StreamReader(stream)

       62           return reader.ReadToEnd() }

       63

       64 // Load an RSS feed's contents into an XML document object

       65 // and use it to extract the titles and their links

       66 // Hopefully these always match (this could be coded more

       67 // defensively)

       68

       69 let titlesAndLinks (name, url, xml) =

       70   let xdoc = new XmlDocument()

       71   xdoc.LoadXml(xml)

       72

       73   let titles =

       74     [ for n in xdoc.SelectNodes("//*[name()='title']")

       75         -> n.InnerText ]

       76   let links =

       77     [ for n in xdoc.SelectNodes("//*[name()='link']") ->

       78         let inn = n.InnerText

       79         if  inn.Length > 0 then

       80           inn

       81         else

       82           let href = n.Attributes.GetNamedItem("href").Value

       83           let rel = n.Attributes.GetNamedItem("rel").Value

       84           if href.Contains("feedburner") then

       85               ""

       86           else

       87             href ]

       88

       89   let descs =

       90     [ for n in xdoc.SelectNodes

       91         ("//*[name()='description' or name()='content' or name()='subtitle']")

       92           -> n.InnerText ]

       93

       94   // A local function to filter out duplicate entries in

       95   // a list, maintaining their current order.

       96   // Another way would be to use:

       97   //    Set.of_list lst |> Set.to_list

       98   // but that results in a sorted (probably reordered) list.

       99

      100   let rec nub lst =

      101     match lst with

      102     | a::[] -> [a]

      103     | a::b ->

      104       if a = List.hd b then

      105         nub b

      106       else

      107         a::nub b

      108     | [] -> []

      109

      110   // Filter the links to get (hopefully) the same number

      111   // and order as the titles and descriptions

      112

      113   let real = List.filter (fun (x:string) -> x.Length > 0) 

      114   let lnks = real links |> nub

      115

      116   // Return a link to the overall blog, if we don't have

      117   // the same numbers of titles, links and descriptions

      118

      119   let lnum = List.length lnks

      120   let tnum = List.length titles

      121   let dnum = List.length descs

      122

      123   if tnum = 0 || lnum = 0 || lnum <> tnum || dnum <> tnum then

      124     [(name,url,url)]

      125   else

      126     List.zip3 titles lnks descs

      127

      128 // For a particular (name,url) pair,

      129 // create an AutoCAD HyperLink object

      130

      131 let hyperlink (name,url,desc) =

      132   let hl = new HyperLink()

      133   hl.Name <- url

      134   hl.Description <- desc

      135   (name, hl)

      136

      137 // Use asynchronous workflows in F# to download

      138 // an RSS feed and return AutoCAD HyperLinks

      139 // corresponding to its posts

      140

      141 let hyperlinksAsync (name, url, feed) =

      142   async { let! xml = httpAsync feed

      143           let tl = titlesAndLinks (name, url, xml)

      144           return List.map hyperlink tl }

      145

      146 // Now we declare our command

      147

      148 [<CommandMethod("rss")>]

      149 let createHyperlinksFromRss() =

      150

      151   // Let's get the usual helpful AutoCAD objects

      152

      153   let doc =

      154     Application.DocumentManager.MdiActiveDocument

      155   let db = doc.Database

      156

      157   // "use" has the same effect as "using" in C#

      158

      159   use tr =

      160     db.TransactionManager.StartTransaction()

      161

      162   // Get appropriately-typed BlockTable and BTRs

      163

      164   let bt =

      165     tr.GetObject

      166       (db.BlockTableId,OpenMode.ForRead)

      167     :?> BlockTable

      168   let ms =

      169     tr.GetObject

      170       (bt.[BlockTableRecord.ModelSpace],

      171       OpenMode.ForWrite)

      172     :?> BlockTableRecord

      173

      174   // Add text objects linking to the provided list of

      175   // HyperLinks, starting at the specified location

      176

      177   // Note the valid use of tr and ms, as they are in scope

      178

      179   let addTextObjects pt lst =

      180     // Use a for loop, as we care about the index to

      181     // position the various text items

      182

      183     let len = List.length lst

      184     for index = 0 to len - 1 do

      185       let txt = new DBText()

      186       let (name:string,hl:HyperLink) = List.nth lst index

      187       txt.TextString <- name

      188       let offset =

      189         if index = 0 then

      190           0.0

      191         else

      192           1.0

      193

      194       // This is where you can adjust:

      195       //  the initial outdent (x value)

      196       //  and the line spacing (y value)

      197

      198       let vec =

      199         new Vector3d

      200           (1.0 * offset,

      201           -0.5 * (Int32.to_float index),

      202           0.0)

      203       let pt2 = pt + vec

      204       txt.Position <- pt2

      205       ms.AppendEntity(txt) |> ignore

      206       tr.AddNewlyCreatedDBObject(txt,true)

      207       txt.Hyperlinks.Add(hl) |> ignore

      208

      209   // Here's where we do the real work, by firing

      210   // off - and coordinating - asynchronous tasks

      211   // to create HyperLink objects for all our posts

      212

      213   let links =

      214     Async.Run

      215       (Async.Parallel

      216         [ for (name,url,feed) in feeds ->

      217           hyperlinksAsync (name,url,feed) ])

      218

      219   // Add the resulting objects to the model-space 

      220

      221   let len = Array.length links

      222   for index = 0 to len - 1 do

      223

      224     // This is where you can adjust:

      225     //  the column spacing (x value)

      226     //  the vertical offset from origin (y axis)

      227

      228     let pt =

      229       new Point3d

      230         (15.0 * (Int32.to_float index),

      231         30.0,

      232         0.0)

      233     addTextObjects pt (Array.get links index)

      234

      235   tr.Commit()

    You can download the new F# source file from here.

    A few comments on the changes:

    Lines 57-62 define our new httpAsync() function, which uses GetResponseAsync() - a function exposed in F# 1.9.3.7 - to download the contents of a web-page asynchronously [and which I stole shamelessly from Don Syme, who presented the code last summer at Microsoft's TechEd].

    Lines 141-144 define another asynchronous function, hyperlinksAsync(), which calls httpAsync() and then - as before - extracts the feed information and creates a corresponding list of HyperLinks. This is significant: creation of AutoCAD HyperLink objects will be done on parallel; it is the addition of these objects to the drawing database that needs to be performed serially.

    Lines 214-217 replace our very simple "map" with something slightly more complex: this code runs a list of tasks in parallel and waits for them all to complete before continuing. What is especially cool about this implementation is the fact that exceptions in individual tasks result in the overall task failing (a good thing, believe it or not :-), and the remaining tasks being terminated gracefully.

    Lines 221 and 233 change our code to handle an array, rather than a list (while "map" previously returned a list, Async.Run returns an array).

    When run, the code creates exactly the same thing as last time (although there are a few more posts in some of the blogs ;-)

    Autocad_does_rss_3

    A quick word on timing: I used "F# Interactive" to do a little benchmarking on my system, and even though it's single-core, single-processor, there was a considerable difference between the two implementations. I'll talk more about F# Interactive at some point, but think of it to F# in Visual Studio as the command-line is to LISP in AutoCAD: you can very easily test out fragments of F#, either by entering them directly into the F# Interactive window or highlighting them in Visual Studio's text editor and hitting Alt-Enter.

    To enable function timing I entered "#time;;" (without the quotations marks) in the F# Interactive window. I then selected and loaded the supporting functions needed for each test - not including the code that adds the DBText objects with their HyperLinks to the database, as we're only in Visual Studio, not inside AutoCAD - and executed the "let links = ..." assignment in our two implementations of the createHyperlinksFromRss() function (i.e. the RSS command). These functions do create lists of AutoCAD HyperLinks, but that's OK: this is something works even outside AutoCAD, although we wouldn't be able to do anything much with them. Also, the fact we're not including the addition of the entities to the AutoCAD database is not relevant: by then we should have identical data in both versions, which would be added in exactly the same way.

    Here are the results:

    I executed the code for serial querying and parallel querying twice (to make sure there were no effects from page caching on the measurement):

    val links : (string * HyperLink) list list

    Real: 00:00:14.636, CPU: 00:00:00.15, GC gen0: 5, gen1: 1, gen2: 0

    val links : (string * HyperLink) list array

    Real: 00:00:06.245, CPU: 00:00:00.31, GC gen0: 3, gen1: 0, gen2: 0

    val links : (string * HyperLink) list list

    Real: 00:00:15.45, CPU: 00:00:00.46, GC gen0: 5, gen1: 1, gen2: 0

    val links : (string * HyperLink) list array

    Real: 00:00:03.832, CPU: 00:00:00.62, GC gen0: 2, gen1: 1, gen2: 0

    So the serial execution took 14.5 to 15.5 seconds, while the parallel execution took 3.8 to 6.3 seconds.

    TrackBack

    TrackBack URL for this entry:
    http://www.typepad.com/services/trackback/6a00d83452464869e200e54ff0919c8833

    Listed below are links to weblogs that reference Using F# Asynchronous Workflows to simplify concurrent programming in AutoCAD:

    » Using F# Asynchronous Workflows to simplify concur... from concurrent
    Bookmarked your post over at Blog Bookmarker.com! [Read More]

    » Kean Walmsley on using F# Asynchronous Workflows to simplify concurrent programming in AutoCAD from Don Syme's WebLog on F# and Other Research Projects
    On Friday Kean Walmsley posted an excellent article on Using F# Asynchronous Workflows to simplify concurrent [Read More]

    » Kean Walmsley on using F# Asynchronous Workflows to simplify concurrent programming in AutoCAD from Don Syme's WebLog on F# and Other Research Projects
    On Friday Kean Walmsley posted an excellent article on Using F# Asynchronous Workflows to simplify concurrent [Read More]

    » Kean Walmsley on using F# Asynchronous Workflows to simplify concurrent programming in AutoCAD from Noticias externas
    On Friday Kean Walmsley posted an excellent article on Using F# Asynchronous Workflows to simplify concurrent [Read More]

    Comments

    http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Arrow.html

    Verify your Comment

    Previewing your Comment

    This is only a preview. Your comment has not yet been posted.

    Working...
    Your comment could not be posted. Error type:
    Your comment has been posted. Post another comment

    The letters and numbers you entered did not match the image. Please try again.

    As a final step before posting your comment, enter the letters and numbers you see in the image below. This prevents automated programs from posting comments.

    Having trouble reading this image? View an alternate.

    Working...

    Post a comment

    Feed & Share

    Search