Flyby

Week 9

This week, I dipped into three different things, giving me a nice base to work from for future projects. First, I continued my research toward implementing Gaussian Splatting by learning about Structure from Motion, a technique for creating a point cloud from several images of the same area. Second, I got up to speed on TensorFlow, a library for machine learning. Lastly, I got started with iOS development and built a couple of toy apps.

Structure From Motion

Structure from Motion (SfM) is a technique for creating a set of points in 3D space from a set of images. It's a fascinating technique. Most notably, unlike older techniques for creating 3D scenes from 2D images, it doesn't require the photos to be taken from a similar distance or elevation. All it needs is enough overlap across multiple images of the same object or area.

The gist of the technique is this:

  1. Run a "feature" detection algorithm to find identifiable areas of the scene. An example of this is the Scale-Invariant Feature Transform (SIFT), which uses the Difference of Gaussians (DoG) at multiple image scales to find distinctive blob-like keypoints that can be recognized again in other images.
  2. Compare these features across images and match them up. If a feature has multiple plausible matches, it is less reliable and may be discarded.
  3. Triangulate the position of the feature in 3D space. This can be done with various techniques from epipolar geometry, which describes the relationships between geometry from two different perspectives, as with stereo vision. If you've ever used a View-Master, you've experienced this first hand!
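Step 2 is the easiest one to show in code. Here's a minimal sketch in plain Python, assuming each feature has already been turned into a numeric descriptor. The function name `match_features`, the toy 3-number descriptors, and the `ratio=0.75` threshold are all made up for illustration; the "keep it only if the best match clearly beats the runner-up" idea is usually called Lowe's ratio test, and a real pipeline would lean on a library rather than this brute-force loop.

```python
import math

def match_features(desc_a, desc_b, ratio=0.75):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    A match is kept only when the best candidate is clearly closer than
    the second-best one. Ambiguous features are discarded.
    The 0.75 threshold is an illustrative assumption.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Euclidean distance from this descriptor to every candidate, closest first.
        ranked = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # no runner-up, so ambiguity cannot be judged
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches

# Toy 3-dimensional descriptors (real SIFT descriptors have 128 dimensions).
image_a = [(0.0, 0.0, 1.0), (5.0, 5.0, 5.0)]
image_b = [(0.1, 0.0, 1.0), (9.0, 9.0, 9.0), (5.0, 5.0, 4.0), (5.0, 5.0, 6.0)]

print(match_features(image_a, image_b))  # [(0, 0)]
```

The first feature has one obvious partner, so it is kept. The second sits exactly between two candidates, so it is thrown away as unreliable.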

Images from the View-Master Wikipedia article:

Image: A View-Master

Image: A View-Master reel

If you look closely at the reel above, you can see that the images on opposite sides of the reel are similar, but not exactly the same. They come from a stereoscopic rig with two cameras spaced about as far apart as a pair of human eyes.
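That small difference between the two images is what makes triangulation possible. As a rough sketch of the simplest case (two identical cameras, side by side and pointing the same way), depth is the focal length times the camera spacing, divided by how far the point shifts between the two images. The function name and every number below are assumptions for illustration, including the 6.5 cm spacing standing in for "about an eye-width". General SfM has to handle arbitrary camera poses, which this does not.

```python
def depth_from_disparity(focal_px, baseline_m, x_left_px, x_right_px):
    """Depth of a point seen by two parallel, side-by-side cameras.

    focal_px    focal length in pixels (assumed identical for both cameras)
    baseline_m  distance between the two cameras in metres
    x_*_px      horizontal pixel position of the same point in each image
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity

# Hypothetical rig: 800 px focal length, cameras 6.5 cm apart.
far = depth_from_disparity(800.0, 0.065, 420.0, 400.0)   # shifts 20 px
near = depth_from_disparity(800.0, 0.065, 440.0, 400.0)  # shifts 40 px

print(f"far point:  {far:.2f} m")
print(f"near point: {near:.2f} m")
```

The point that shifts more between the two views comes out closer, which is the same cue your eyes use when looking through a View-Master.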

TensorFlow

TensorFlow is a set of tools for working with machine learning concepts, and it has pretty good support for simple feed-forward networks like the ones I made last week. The Coding Train has a set of videos introducing TensorFlow.js, a JavaScript implementation of TensorFlow (which is written in C++ but is most often used through Python bindings).

I don't have too much to say about TensorFlow, other than that it makes the concepts I learned about last week trivial to implement. TensorFlow.js uses some clever WebGL tricks to perform computation on the GPU, since neural networks rely heavily on matrix operations, which can be parallelized for a big speedup.
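To make the "matrix operations" part concrete, here's a minimal sketch of a feed-forward pass in plain Python. This is not TensorFlow's API; the helper names (`matvec`, `dense`, `feed_forward`), the sigmoid activation, and the weights are all assumptions for illustration.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(weights, biases, inputs):
    """One fully connected layer: activation(W @ x + b)."""
    return [sigmoid(z + b) for z, b in zip(matvec(weights, inputs), biases)]

def feed_forward(layers, inputs):
    """Run the inputs through each (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(weights, biases, activations)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden units -> 1 output.
layers = [
    ([[0.5, -0.5], [1.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]

output = feed_forward(layers, [1.0, 1.0])
print(output)
```

Each row in `matvec` is computed independently of the others, which is why this kind of work spreads so well across the many cores of a GPU.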

I've played around with PyTorch, another ML library for Python, but the web versions of it aren't as well-maintained as TensorFlow.js, so I'll likely reach for TensorFlow.js if I end up implementing neural networks in JavaScript in the future.

iOS Development

The last thing I did was learn more about iOS development and renew my (very old and very expired) Apple Developer Program membership. In the years since I last did iOS work, the newer development language (Swift) has improved substantially. I enjoy using Swift. It shares some concepts with F# and Kotlin (the language often used to develop Android apps). The newer UI library, SwiftUI, and the data management library, SwiftData, are especially nice to use.

I followed a few tutorials, ending up with a couple of very simple apps for managing small amounts of data and the relationships between them. I've got an idea for a simple app to collect pictures, audio, or video onto "cards" that hold their context, so I may have a prototype of that soon. The one sticking point for now is that using the camera tends to be slightly more complicated than simply picking photos that have already been taken, but I'm not too worried about that. It seems fairly straightforward once I get my head around some of the Swift concepts involved.

Wrapping Up

I don't have just one "thing" to show you this week, sorry! I did, however, learn a lot. The iOS stuff is pretty exciting, since I think I'll be able to throw together simple apps which solve problems I have (and maybe some that other people have too).

I'm travelling next week, and won't have too much time to work on a project, but I'll still post what I learn and anything interesting I end up with on Sunday.

See you then!