6/6/18, 1:55 AM: Computer Vision: Bundle Adjustment
In the application of computer vision to 3D reconstruction, bundle adjustment is the optimization of the various parameters describing the reconstruction. Most commonly, we might want to optimize the 3D locations of the viewed features and the camera poses from a stereo reconstruction.

Optimization
Optimization implies that there is an error to minimize in a system. In this case, the system is the 3D reconstruction, containing all of the reconstructed points and, for simplicity of explanation, two camera positions that viewed them. The error in the system is a little less straightforward to explain.

To compute an error in the system, we must first be able to make a measurement and compare it to an observation. The measurement made here is the triangulated position of features viewed by both cameras. Via triangulation, the 3D position of a feature can be found, given the location estimates of both cameras and the bearing of the 3D feature from each camera, obtained from its 2D image coordinates. Several of these 3D points are measured by iteratively matching 2D image features and then triangulating their 3D positions. The 3D point locations are computed assuming the matches, camera locations, and 2D image coordinates are accurate, which they are exactly not, and this is where the error arises. How the error is measured is fairly straightforward. The (probably inaccurate) estimates of the camera poses are known, and their calibration matrices are known, so their projection matrix estimates are known. One may then project a reconstructed 3D point back onto the image plane of both cameras (one at a time for now).

Certainly, because of the assumptions that the triangulation process makes about the accuracy of the parameters passed to it, a 3D point reprojected back onto the image plane will not lie exactly on its parent 2D feature. This disparity is called the reprojection error, and it is the error we seek to minimize. The reprojection errors are summed over all the 3D features seen in all camera frames, a bundle of camera frames, and hence the name, bundle adjustment. This sum is the cost function of a minimization problem, which is typically solved by a nonlinear least-squares method using the function's derivatives. For the curious, this is a task well suited for an optimization library, something like Ceres-Solver (BSD licensed), by Google.
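In symbols (my own shorthand, not taken from any particular reference), writing \pi(C_i, X_j) for the projection of 3D point X_j into camera i with parameters C_i, and x_{ij} for the matching observed 2D feature, the quantity being minimized is roughly

E = \sum_i \sum_j \left\lVert \pi(C_i, X_j) - x_{ij} \right\rVert^2

with the sum running over every camera i that observes point j, and both the camera parameters and the point positions free to move.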

Parameters
In order to obtain the cost functor for one observation of a 3D feature, the following parameters are required:
1. Focal length
2. Distortion parameters
3. 3D View pose (translation, rotation)
4. 3D Feature position (translation)
5. 2D Feature position

The first four parameters are used to compute the projected 2D feature position on the image plane of the camera that observed the 3D feature. The last parameter is held fixed (immutable) and is used for comparison when computing the reprojection error. The detection of a 2D feature is the least unstable of these quantities, and it is thus used as a kind of ground truth to compare the measurement against.

A unique observation is made when a camera sees and recognizes a 3D point of the reconstruction from the 2D features in its image. Hence, for each 3D point seen by each camera, an observation consisting of a unique combination of the above parameters exists.

For example, if three 3D features are seen and shared by two camera frames, then six different combinations of the above five parameters (six observations) exist.
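As a concrete illustration, here is a rough sketch of what such a cost functor could look like with Ceres-Solver, modeled on its bundle adjustment tutorial. The parameter layout (angle-axis rotation, translation, focal length, two radial distortion coefficients) and all names are my own assumptions for this sketch, not tied to any particular pipeline:

#include <ceres/ceres.h>
#include <ceres/rotation.h>

// One residual block per observation: camera i sees 3D point j.
// camera = [angle-axis rotation (3), translation (3), focal length, k1, k2]
// point  = [X, Y, Z]
struct ReprojectionError {
    ReprojectionError(double observed_x, double observed_y)
        : observed_x(observed_x), observed_y(observed_y) {}

    template <typename T>
    bool operator()(const T* const camera, const T* const point, T* residuals) const {
        // Rotate and translate the 3D point into the camera frame.
        T p[3];
        ceres::AngleAxisRotatePoint(camera, point, p);
        p[0] += camera[3];
        p[1] += camera[4];
        p[2] += camera[5];

        // Perspective division onto the normalized image plane.
        T xp = p[0] / p[2];
        T yp = p[1] / p[2];

        // Apply radial distortion, then the focal length.
        const T& focal = camera[6];
        const T& k1 = camera[7];
        const T& k2 = camera[8];
        T r2 = xp * xp + yp * yp;
        T distortion = T(1.0) + r2 * (k1 + k2 * r2);
        T predicted_x = focal * distortion * xp;
        T predicted_y = focal * distortion * yp;

        // Reprojection error: predicted minus observed 2D feature position.
        residuals[0] = predicted_x - T(observed_x);
        residuals[1] = predicted_y - T(observed_y);
        return true;
    }

    double observed_x, observed_y;
};

// Usage, per observation (2 residuals, 9 camera params, 3 point params):
// problem.AddResidualBlock(
//     new ceres::AutoDiffCostFunction<ReprojectionError, 2, 9, 3>(
//         new ReprojectionError(obs_x, obs_y)),
//     nullptr, camera_params, point_params);

The solver then adjusts every camera and point block to drive the summed squared residuals down.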
5/3/18, 11:23 AM: ORBSLAM2: First look
That is the reconstruction of my office hallway using ORBSLAM2. 
ORBSLAM2 is an open-source visual SLAM implementation released under the General Public License (GPL), which broadly means anyone may use, modify, and redistribute the software, provided derivative works remain under the same license. There is some considerable setup involved to get it to work, and you need some idea of compiling C++. I did all of it on VirtualBox running Ubuntu 14.04, and on a native Windows OS.

Note the "double-vision" effect of the points near the bottom of the reconstruction. The batch adjustment algorithm would take care of that if I continued walking around a bit more.

The supposed scale-drift effect described in the ORBSLAM2 paper by Raulmur cannot really be seen here (partly because I don't have a ground-truth image), and based on my own estimation it looks accurate. This is because I took care to make proper reconstructions while moving the camera. There is probably room for innovation in a method of moving the robot such that proper reconstructions can be made. Alternatively, a stereo setup may be used to obtain depth information, as Raulmur also suggests.

Besides the scale-drift problem, in order to make ORBSLAM2 useful, a way to save and load the map created by your motion should be available. An application may not require simultaneous localisation and mapping: once a reliable map has been constructed, only localisation is needed, and the robot can then be configured to specialise in other tasks besides SLAM. Since this implementation runs a multi-threaded process, a separate thread must be created for this so as not to interrupt the other threads running in the background.

Getting the coordinates of MapPoints

There are a few methods that will help to get these coordinates, and I have listed them here.

GetAllMapPoints() is a method of Map that returns a vector of pointers to all the MapPoint objects that have been created by the program and visualised in the Pangolin viewer.

GetWorldPos() is a method of MapPoint that returns the absolute coordinates of the point as a cv::Mat (OpenCV matrix) object.

With these in mind, after setting up a thread for user input (including syncing it with the other threads, which is actually the hard part), getting the 3D map information should be a matter of converting the list of MapPoint pointers into a list of cv::Mat objects containing just the absolute coordinates of all 3D points, as sketched below.
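A minimal sketch of that conversion, assuming access to the ORB_SLAM2::Map instance and that GetAllMapPoints() and GetWorldPos() behave as described above (the isBad() check is my assumption, meant to skip points the system has culled):

#include <vector>
#include <opencv2/core/core.hpp>
#include "Map.h"
#include "MapPoint.h"

// Collect the absolute 3D coordinates of every MapPoint in the map.
std::vector<cv::Mat> ExtractMapPointCoordinates(ORB_SLAM2::Map* pMap)
{
    std::vector<cv::Mat> vCoords;
    const std::vector<ORB_SLAM2::MapPoint*> vpMPs = pMap->GetAllMapPoints();
    for (ORB_SLAM2::MapPoint* pMP : vpMPs)
    {
        if (!pMP || pMP->isBad())
            continue;
        // GetWorldPos() returns a 3x1 cv::Mat holding the point's XYZ position.
        vCoords.push_back(pMP->GetWorldPos().clone());
    }
    return vCoords;
}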

It is also worth noting that if the goal is to reuse the MapPoint objects, it is necessary to check how the Map is initialised, so that the load function can initialise the Map in the same way.

Further, the default initialisation causes the Tracking thread to start by initialising a new map. However, once a Map is loaded, we want the Tracking thread to start with relocalisation instead.

Saving Coordinates

Usually this wouldn't be an issue, but ORBSLAM runs multithreaded, so a new thread might be necessary in order to process user (or other programs') input, much like the Pangolin viewer. To simplify things for now, placing the load and save functions in the initialisation and shutdown routines respectively might be easier.

Saving MapPoint Objects

I did not expect this to be a problem, but since the tracking, mapping and loop closing threads need MapPoint objects to work, it is necessary to save all the information they carry in order to reuse them in a Load function later. After some searching I determined that the best way to do that is through the Boost serialization library. For some reason, the Boost documentation did not specify which libraries to link against, and I had some trouble compiling the demo program.

In the end, I found here that the library flag for boost/filesystem is -lboost_filesystem-mt, and by a bit of trial and error, since I was trying to use the serialization library, the flag to use there is -lboost_serialization (American spelling), and tadaaah, it works.

I used:
g++ source_file.cpp -lboost_<relevant_library> -o target_name
There is also the option of using CMake to help find your package, which, if you know how, is considerably easier. Using the find_package(Boost REQUIRED) command in CMake, the libraries and include directories are easily found.

TODO: how to use CMake.

After getting the Boost libraries to work, we need Boost to help us save and load the Map or MapPoint objects - whichever is more convenient. It seems now, based on my understanding of Boost, that it would make sense to just save the entire Map object. Initially I thought that Boost would not be able to detect pointers and would try to serialize them directly (which doesn't make sense, because pointers just contain memory addresses, not the more valuable information they point to), but after looking into it a little more I found that the Boost library is more intelligent than I initially thought. Boost is able to identify pointers, follow them to the objects they point at, then serialize and track those objects. Boost knows which object it is looking at based on the address it was serialized from, and will not serialize and archive repeat objects.

So this means that in an object of class Human called brian, containing pointers to objects of class Car and class House called car_ptr and house_ptr respectively, even though car_ptr and house_ptr contain addresses rather than the actual objects, Boost will look for the objects that live at the addresses given by the pointers and serialize them to be archived.

One tricky thing is that when you want Boost to archive a custom something, you need to give it access to the information inside the object. This is done by sticking this member function into the class definition:

// Allow Boost serialization to access non-public data members.
friend class boost::serialization::access;
// Serialize the chosen member variables of this class.
template<class Archive>
void serialize(Archive & ar, const unsigned int version)
{
    ar & member_variable_name_1;
    ar & member_variable_name_2;
    ar & member_variable_name_3;
}
serialize is a function used by Boost when you use the operators "&", ">>" and "<<" to store/retrieve objects from archives. The friend declaration gives Boost access to all the data, private and public, declared and stored in this particular class of object. You can choose to serialize multiple member variables inside your object, and omit several as well.

I am still unsure what happens to the member variables that are not saved when the entire object is reloaded on a later occasion. I assume they will default to undefined values; I can test this tomorrow.

Further, since Boost looks for the object referenced by a pointer (car_ptr and house_ptr from above), the referenced object must also contain a serialize function, and so must each object referenced within that one, and so on. According to this, if your class contains a non-primitive object (one not already covered by Boost's serialization headers), you have to add the above code to that object's definition as well in order for it to be serializable. This implies that for all objects that Map contains, I have to add the serialization function in their class definitions.

This way Boost will be able to save an ENTIRE complex object to an archive - including the pointer references - and reconstruct the saved object in its entirety at a different time and place. Ideally, I will be able to save and reload the whole Map object this way.
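To make the brian/car_ptr idea concrete, here is a small standalone sketch (not ORBSLAM2 code; all class and member names are made up) showing Boost following a raw pointer, archiving the pointed-to object, and rebuilding everything on load:

#include <fstream>
#include <string>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/string.hpp>

class Car {
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version) {
        ar & plate_number;
    }
public:
    std::string plate_number;
};

class Human {
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version) {
        ar & name;
        ar & car_ptr;  // Boost follows the pointer and serializes the Car it points at.
    }
public:
    std::string name;
    Car* car_ptr = nullptr;
};

int main() {
    Car car;
    car.plate_number = "ABC123";
    Human brian;
    brian.name = "brian";
    brian.car_ptr = &car;

    {
        // Save brian (and, through the pointer, his Car) to a text archive.
        std::ofstream ofs("brian.txt");
        boost::archive::text_oarchive oa(ofs);
        const Human& to_save = brian;  // Boost prefers saving through a const reference.
        oa << to_save;
    }

    Human restored;
    {
        // Load: Boost allocates a new Car and points restored.car_ptr at it.
        std::ifstream ifs("brian.txt");
        boost::archive::text_iarchive ia(ifs);
        ia >> restored;
    }
    return 0;
}

Note that on load, Boost allocates the pointed-to Car with new, so the loading code owns it and should eventually delete it.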

The next step would be to investigate the relocalization (American) process, and find out how to start the SLAM with a relocalization.
10/24/16, 9:47 AM: Cheers mates.
this is it. i got what i asked for. every minute, every moment, every sweet and thoughtful conversation, to every confused, irrational, fearful exchange. as much as i hate to say it, it was bound to happen and i knew it deep down and perhaps, in some way, it was a self-fulfilling kind of eventuality. i'm such a moron dammit. its kind of funny, in a sort of twisted self-loathing kind of way. i compared it all to skating and climbing and all the other things that you can just grind away at yourself. woop big surprise doesn't work that way. and dammit. fucking dammit it hurts so bad and doesn't go away. there's a whole lot of self-doubt going on. and much more i can't even pinpoint. lost the friendship, lost the trust, lost the companionship... my fucking big mouth. i hate my stupid shit eq, lack of cool. god dammit dammit all dammitdammit.
11/3/15, 10:42 PM: what's your crisis?


Maybe sometimes we spend so much time seeking happiness that we forget to just be happy.

I need some time to think. I realise I want to leave some things in my record, so I can pick them up and think about them later.

I do have some things in mind.

These days I find myself fearing what others think of me. There seem to be a few ways to go about it - and a lot of grey in between. Cast it aside, tread boldly my own way. Pick up from what they say and change accordingly...

I have no time right now. I'll be back.
10/21/15, 7:22 AM: well, hello.
Hmm...

A lot has changed? Or maybe you could say nothing has changed in the grand scheme of things... Or maybe more accurately you could say things have been changing recently. I always get the temptation of going back where I came. There is comfort in the thought of the tried and tested. It is an ever present fear. When what you're trying just stops working, you want to make a reversion.

I tell myself that there's no way back. All you can do is take comfort in the thought of what was. There is greater certainty, with what experience you have now, compared to what you had, that there is more pain ahead, but also bigger lessons to learn from.

to my surprise... it seems to follow those who seek it.

Looking back, I can tell you how functional I was, how many good decisions I made for myself. But also how cold it felt. It could be, the beginning of uni gave me a start point. A place to make some daring changes. Daring by my own standards.

I'm not sure if this could pass off as a confession, but often I am confronted with a scary thought. I don't have an affinity with people. It's true. It's a defining characteristic of what I am. I don't manage to appreciate everyone around me all at once. Sometimes it almost feels like I only want to be near people I can benefit from.