While on my way to San Francisco last weekend, I had a 7-hour layover in Washington D.C. Given the amount of time available to me, I decided to head from Dulles into the centre of the city and do something fun: I ended up choosing to visit the Natural History Museum at The Smithsonian, which proved to be really interesting even to my somewhat bleary eyes.
Part of my motivation was to try to capture something for bringing into Photosynth (as I’d borrowed my wife’s digital SLR for taking some snaps at the wedding I was attending in Las Vegas). I’m becoming increasingly fascinated with the field of computer vision (I’ve talked in the past about some of the possibilities it brings to our industry), and am doing what I can to get to know more about it, which will prove especially important if I end up talking about it at AU.
Before I show – and talk about – the results of these efforts, here are the basic steps for creating a Photosynth.
Firstly you clearly need to take pictures. I won’t give much advice on how to do this – as I seem not to be getting the best results, myself – but the trick appears to be to take enough pictures of a chosen target that has a lot of “features” which can be matched between the images. A point needs to be shared across a minimum of three photos taken from different angles, it seems, for it to be coordinated properly in 3D space.
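If you’re curious whether a given pair of shots shares enough detail to stitch, it’s easy enough to experiment with feature matching yourself. Here’s a minimal sketch using OpenCV’s SIFT detector and a brute-force matcher – purely an illustration of the general idea, as Photosynth’s own feature pipeline isn’t public, and the file names are just placeholders:

```python
# Rough check of how many features two photos share, using OpenCV.
# This only illustrates the general idea of feature matching -
# it is not Photosynth's pipeline. File names are placeholders.
import cv2

img1 = cv2.imread("shot_01.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("shot_02.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match each descriptor to its two nearest neighbours and apply
# Lowe's ratio test to discard ambiguous matches.
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

print(f"{len(kp1)} and {len(kp2)} features detected; "
      f"{len(good)} plausible matches between the two shots")
```

The more matches like these you get between neighbouring shots, the better the chance that each point ends up visible from the three-plus angles it needs.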
Once you have a set of snaps, go to the Photosynth website and select the “Upload” button at the top-right of the page followed by the “Create a Synth” button on the following page (at which point you will need to log in via your Microsoft Live login). You should see a UI allowing you to select images and name your synth:
Once you’ve specified basic information for your synth and have selected the photos to use, you can choose the “Synth” button, which will fire off a sequence of operations:
Now despite the fact that there’s a web service providing access to Photosynth data, a lot of the heavy lifting needed to create the synth (such as extracting the image features and matching them across the various images) is performed in the “Synther” part of the Photosynth client, while the photos themselves are being uploaded: a systems architecture apparently known as edge computing. While this is in some ways an efficient use of computing resources, the approach does appear to limit the ability to just use the web service to create synths – something I would love to do – which is a shame. [If someone is able to tell me I’m wrong about this, I’ll very happily buy them a beer. :-)]
It’s also important to note that the algorithm used to match features between images is non-linear – each image needs to be compared with each of the others, from what I understand, which makes it an n-squared problem – so the time taken will grow much faster than linearly as you add images to your set. I’m very interested to know why the decision was made to run these operations locally, rather than just throwing them at “the cloud”. Perhaps the economics behind a potentially high-usage service such as Photosynth doesn’t/didn’t lend itself to performing such operations centrally, making it sensible for the client to take a share of the burden.
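To put a number on that: if every image is compared with every other, the number of pairwise comparisons is n(n−1)/2. A trivial sketch (nothing to do with Photosynth’s actual implementation, of course):

```python
# How pairwise image matching scales: every image compared with every
# other one. Purely illustrative - not how Photosynth is implemented.
def pair_count(n):
    """Number of image pairs to compare for n photos: n * (n - 1) / 2."""
    return n * (n - 1) // 2

for n in (10, 30, 100, 300):
    print(f"{n:>4} photos -> {pair_count(n):>6} pairwise comparisons")

# 10 photos need 45 comparisons, but 300 photos need 44,850: doubling
# the number of photos roughly quadruples the matching work.
```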
I created a few synths of skeletons from images taken at The Smithsonian:
Oh deer
A skinny bird
A scary looking beast
None of these synths came out especially well, in my opinion: they have quite low “synthiness” scores – a measure of how many of the images contributed to the point cloud – probably because I felt a bit foolish taking more than 30 or so pictures of any one thing in a museum, and because the varying effects of the flash hampered the ability to stitch the images together.
One more thing to bear in mind: Photosynth really only provides a “sparse” point cloud – the points needed to help coordinate the camera positions of the various images. It’s clearly possible to sample more points to create a more dense cloud once you have accurate camera positions, but this isn’t currently the focus of the Photosynth service, and so expectations should be set appropriately. That’s not to say it’s not possible to create clouds of 1M+ points using Photosynth – I’ve seen that happen – but it really takes a lot of dedication to get up to that level of detail.
As Blaise Aguera y Arcas states, it’s possible to densify point clouds coming out of systems such as Photosynth – by using tools such as MVSCPC – but for that you need the camera location information (which does appear to be accessible, based on the latest version of Christoph Hausner’s exporter). And just as point clouds are really only a by-product of Photosynth, in a 3D editing system with computer vision/photogrammetry integrated more closely the need to work directly with the base point cloud is reduced: you could be generating a mesh, or even some more appropriate 3D geometry, instead.
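As a small illustration of why the camera information is the key ingredient: once you have a projection matrix for each photo, any feature seen in two or more of them can be triangulated back into 3D, which is essentially what a densification step does on a much larger scale. Here’s a minimal sketch using OpenCV – the projection matrices and image coordinates are invented purely for the example:

```python
# Triangulating a single 3D point from two calibrated views with OpenCV.
# The projection matrices and image coordinates below are invented for
# illustration; real values would come from the exported camera data.
import numpy as np
import cv2

# 3x4 projection matrices (intrinsics * [R|t]) for two cameras.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])                  # camera at the origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # shifted along X

# The same scene point as observed in each image (2xN arrays).
pts1 = np.array([[0.5], [0.2]])
pts2 = np.array([[0.25], [0.2]])

# Homogeneous 3D coordinates (4xN); divide by w to get X, Y, Z.
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
X = (X_h[:3] / X_h[3]).ravel()
print("Triangulated point:", X)  # roughly [2.0, 0.8, 4.0]
```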
Near the end of my week in the Bay Area, I spent some time chatting about this field with Brian Mathews, the VP of Autodesk Labs. Brian is a scarily bright fellow with deep insights into many different areas of technology. To demo something his team is working on – once again, watch this space :-) – he took a set of photos that I’ve also uploaded into Photosynth:
Kean’s head
To get a feel for the results, here’s a screenshot of the point cloud inside AutoCAD. You can try this yourself using the Photosynth import plugin I’ve shown in recent posts. It’s a very light point cloud – just 9,425 points from the 31 source images – but at least it looks like me from a distance. :-)
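Incidentally, if you ever want to get a small cloud like this into AutoCAD without a plugin, one low-tech route is to write the points out as a script of POINT commands. A quick sketch – note that the input here (a plain list of XYZ tuples) is an assumption for the example, not Photosynth’s export format:

```python
# Write a list of XYZ points out as an AutoCAD script (.scr) of POINT
# commands - a crude alternative to a proper import plugin. The input
# (a plain list of XYZ tuples) is an assumption for this example, not
# Photosynth's own export format.
def write_point_script(points, path="cloud.scr"):
    with open(path, "w") as scr:
        for x, y, z in points:
            scr.write(f"POINT {x},{y},{z}\n")

# A few made-up points; in practice you'd read the coordinates from
# whatever your exporter of choice produces.
sample = [(0.0, 0.0, 0.0), (1.2, 0.4, 2.1), (-0.7, 1.5, 0.3)]
write_point_script(sample)
```

Running the resulting file via AutoCAD’s SCRIPT command will place one point entity per coordinate (you may want to adjust PDMODE so the points are actually visible).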
Thanks to Nate Lawrence for pointing me at a related discussion on some of the above issues (I’d been planning a post along these lines, and Nate’s ping reminded me of some relevant resources he’d listed there).