UMT-Computer-Vision-Spring-2019 · addiboyer24 · Jan 25, 2019
diff --git a/.DS_Store b/.DS_Store
diff --git a/.ipynb_checkpoints/main-checkpoint.ipynb b/.ipynb_checkpoints/main-checkpoint.ipynb
diff --git a/.ipynb_checkpoints/main_NT-checkpoint.ipynb b/.ipynb_checkpoints/main_NT-checkpoint.ipynb
diff --git a/.ipynb_checkpoints/project_description-checkpoint.ipynb b/.ipynb_checkpoints/project_description-checkpoint.ipynb
@@ -0,0 +1,243 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Project 1: Pose estimation\n",
+    "## Due Jan. 30th\n",
+    "\n",
+    "### Assigned Reading: Szeliski Ch 2.3, Ch. 6.2\n",
+    "\n",
+    "### Problem description\n",
+    "One of the most common computer vision tasks, particularly for things like practical robotics, is called *pose estimation*.  *Pose* is simply the computer vision term for the vector\n",
+    "$$\n",
+    "\\mathbf{p} = [X_{cam},Y_{cam},Z_{cam},\\phi,\\theta,\\psi],\n",
+    "$$\n",
+    "where the first three elements of the vector are the position of a camera and the last three elements are its yaw, pitch, and roll.  *Pose estimation* is simply determining these values from an image.  \n",
+    "\n",
+    "How is this done?  Imagine that we have identified the real-world coordinates $\\mathbf{X}_i$ of several features that are easily identified, and fit in one photograph.  We'll call them ground control points (GCPs).\n",
+    "<img src=\"gcp.jpg\">\n",
+    "Using code that we've already developed, we can simulate where these GCPs should project to in the image.  If we already know the correct pose, when we perform this projection, the projection of the GCPs (the steeple of M, for example), should be collocated with that feature in a real image that we took with the camera.  This is a good way of ensuring that our camera model is correct.  \n",
+    "\n",
+    "However, usually the pose is not known *a priori*.  Instead, we need to find the pose that reduces the misfit between the projection of the GCPs, and their identified location in the image.  At its core, you can think of this as a least-squares problem: adjust the pose of the model camera such that the squared difference between the projection of the GCP and its location in the image is minimized.  We can write this mathematically as:\n",
+    "$$\n",
+    "\\mathbf{p}_{opt} = \\mathrm{argmin}_{\\mathbf{p}} \\frac{1}{2} \\sum_{i=1}^n \\sum_{j=1}^2 (f(\\mathbf{X}_i,\\mathbf{p})_j - \\mathbf{u}_{ij})^2,\n",
+    "$$\n",
+    "where $n$ is the number of GCPs, $f(\\mathbf{X},\\mathbf{p})$ is the projection of real-world coordinates $\\mathbf{X}$ into the camera's pixel coordinates (which depends on the pose $\\mathbf{p}$), and $\\mathbf{u}_i$ is the pixel coordinates of the corresponding point in the image.  When properly formulated, this minimization problem is straightforward to solve.  The classic method for doing so is the [Levenberg-Marquardt algorithm](https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm), which interpolates between the Gauss-Newton method and gradient descent.  \n",
+    "\n",
+    "### Software Requirements:\n",
+    "Your assignment is to develop a camera model that has the capability to perform pose estimation.  It should be structured as a Python class with (at least) the following methods:\n",
+    "* A method for performing the projective transform\n",
+    "* A method for performing the transformation from world to generalized camera coordinates\n",
+    "* A method for estimating pose, given ground control points (an excellent Python implementation of Levenberg-Marquardt is available through [`scipy.optimize.least_squares`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html) with `method='lm'`).\n",
+    "\n",
+    "A skeleton for this class might be:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "class Camera(object):\n",
+    "    def __init__(self):\n",
+    "        self.p = None                   # Pose\n",
+    "        self.f = None                   # Focal Length in Pixels\n",
+    "        self.c = np.array([None,None])  # Sensor size in pixels (width, height)\n",
+    "        \n",
+    "    def projective_transform(self,x):\n",
+    "        \n",
+    "        focal = self.f\n",
+    "        sensor = self.c\n",
+    "        \n",
+    "        # Generalized (normalized) image coordinates\n",
+    "        gcx = x[0]/x[2]\n",
+    "        gcy = x[1]/x[2]\n",
+    "        \n",
+    "        # Pixel locations (principal point assumed at the image center)\n",
+    "        pu = gcx*focal + sensor[0]/2.\n",
+    "        pv = gcy*focal + sensor[1]/2.\n",
+    "        \n",
+    "        return np.array([pu, pv])\n",
+    "    \n",
+    "    def rotational_transform(self,X):\n",
+    "        \"\"\"  \n",
+    "        This function performs the translation and rotation from world coordinates into generalized camera coordinates.\n",
+    "        \"\"\"\n",
+    "        pass\n",
+    "    \n",
+    "    def estimate_pose(self,X_gcp,u_gcp):\n",
+    "        \"\"\"\n",
+    "        This function adjusts the pose vector such that the difference between the observed pixel coordinates u_gcp \n",
+    "        and the projected pixels coordinates of X_gcp is minimized.\n",
+    "        \"\"\"\n",
+    "        pass\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Testing Requirements\n",
+    "You should test this code on real-world imagery of your own making.  Go out into the world and take a photograph of a scene in which you will be able to identify real-world coordinates.  As an example (which you are free to emulate), I took a photograph of Main Hall from the Oval (see above).  In the background was the M, along with a few other things.  I selected several prominent features in my image, recorded their image coordinates, then used Google Earth (with coordinates set to UTM mode) to determine their location in world coordinates:\n",
+    "\n",
+    "| u (px) | v (px) | Easting (m) | Northing (m) | Elevation (m) | Description |\n",
+    "|--------|--------|-------------|--------------|---------------|-------------|\n",
+    "| 1984 | 1053 | 272558.68 | 5193938.07 | 1015 | Main Hall spire |\n",
+    "| 884 | 1854 | 272572.34 | 5193981.03 | 982 | Large spruce |\n",
+    "| 1202 | 1087 | 273171.31 | 5193846.77 | 1182 | Bottom of left tine of M |\n",
+    "| 385 | 1190 | 273183.35 | 5194045.24 | 1137 | Large rock outcrop on Sentinel |\n",
+    "| 2350 | 1442 | 272556.74 | 5193922.02 | 998 | Southernmost window apex on Main Hall |\n",
+    "\n",
+    "I saved this table as a txt file, which I read and then use in my estimate_pose function.  \n",
+    "\n",
+    "### Additional notes\n",
+    "* The pose vector has six elements.  Each ground control point has two observations ($u$ and $v$).  How many points are needed to fully constrain the minimization problem?  (Note that more observations are generally better, but there is a minimum for the problem to be well posed.)\n",
+    "\n",
+    "* You will need to determine the focal length of your camera.  To do this you will need to read the image's [Exif metadata](https://en.wikipedia.org/wiki/Exif).  Many image viewers (Eye of GNOME, for example) will display it.  Look under Properties.  Alternatively, the `identify` command-line tool from ImageMagick can be used:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "    exif:ApertureValue: 185/100\n",
+      "    exif:BrightnessValue: 0/100\n",
+      "    exif:ColorSpace: 1\n",
+      "    exif:ComponentsConfiguration: 1, 2, 3, 0\n",
+      "    exif:DateTime: 2019:01:22 12:48:36\n",
+      "    exif:DateTimeDigitized: 2019:01:22 12:48:36\n",
+      "    exif:DateTimeOriginal: 2019:01:22 12:48:36\n",
+      "    exif:ExifImageLength: 2448\n",
+      "    exif:ExifImageWidth: 3264\n",
+      "    exif:ExifOffset: 238\n",
+      "    exif:ExifVersion: 48, 50, 50, 48\n",
+      "    exif:ExposureBiasValue: 0/10\n",
+      "    exif:ExposureMode: 0\n",
+      "    exif:ExposureProgram: 2\n",
+      "    exif:ExposureTime: 1/3230\n",
+      "    exif:Flash: 0\n",
+      "    exif:FlashPixVersion: 48, 49, 48, 48\n",
+      "    exif:FNumber: 190/100\n",
+      "    exif:FocalLength: 291/100\n",
+      "    exif:FocalLengthIn35mmFilm: 27\n",
+      "    exif:GPSDateStamp: 2019:01:22\n",
+      "    exif:GPSInfo: 6272\n",
+      "    exif:GPSTimeStamp: 19/1, 48/1, 36/1\n",
+      "    exif:GPSVersionID: 2, 2, 0, 0\n",
+      "    exif:ImageLength: 2448\n",
+      "    exif:ImageUniqueID: R08QSJA00AA\n",
+      "    exif:ImageWidth: 3264\n",
+      "    exif:InteroperabilityOffset: 6242\n",
+      "    exif:ISOSpeedRatings: 50\n",
+      "    exif:LightSource: 0\n",
+      "    exif:Make: samsung\n",
+      "    exif:MakerNote: 7, 0, 1, 0, 7, 0, 4, 0, 0, 0, 48, 49, 48, 48, 2, 0, 4, 0, 1, 0, 0, 0, 0, 32, 1, 0, 12, 0, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 16, 0, 5, 0, 1, 0, 0, 0, 90, 0, 0, 0, 64, 0, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 80, 0, 4, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 3, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0\n",
+      "    exif:MaxApertureValue: 185/100\n",
+      "    exif:MeteringMode: 2\n",
+      "    exif:Model: SM-J727V\n",
+      "    exif:Orientation: 1\n",
+      "    exif:ResolutionUnit: 2\n",
+      "    exif:SceneCaptureType: 0\n",
+      "    exif:SceneType: 1\n",
+      "    exif:SensingMethod: 2\n",
+      "    exif:ShutterSpeedValue: 11657/1000\n",
+      "    exif:Software: J727VVRS2BRK1\n",
+      "    exif:thumbnail:Compression: 6\n",
+      "    exif:thumbnail:ImageLength: 384\n",
+      "    exif:thumbnail:ImageWidth: 512\n",
+      "    exif:thumbnail:InteroperabilityIndex: R98\n",
+      "    exif:thumbnail:InteroperabilityVersion: 48, 49, 48, 48\n",
+      "    exif:thumbnail:JPEGInterchangeFormat: 6480\n",
+      "    exif:thumbnail:JPEGInterchangeFormatLength: 14017\n",
+      "    exif:thumbnail:Orientation: 1\n",
+      "    exif:thumbnail:ResolutionUnit: 2\n",
+      "    exif:thumbnail:XResolution: 72/1\n",
+      "    exif:thumbnail:YResolution: 72/1\n",
+      "    exif:UserComment: 65, 83, 67, 73, 73, 0, 0, 0, 10, 0, 0, 0, 65, 76, 67, 83, 73, 73, 70, 48, 65, 119, 98, 0, 0, 0, 0, 0, 99, 111, 114, 101, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 3, 0, 0, 208, 106, 55, 112, 111, 112, 0, 0, 0, 153, 153, 153, 153, 22, 32, 22, 17, 37, 23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 65, 119, 98, 0, 0, 0, 0, 0, 83, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 3, 0, 0, 208, 106, 55, 112, 111, 112, 0, 0, 0, 153, 153, 153, 153, 22, 32, 22, 17, 37, 23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 74, 74, 67, 83, 73, 73, 70, 53, 59, 187, 9, 0, 154, 196, 1, 0, 0, 0, 1, 0, 4, 140, 1, 0, 84, 50, 2, 0, 101, 187, 254, 255, 71, 18, 0, 0, 141, 162, 255, 255, 5, 184, 1, 0, 111, 165, 255, 255, 197, 4, 0, 0, 28, 54, 255, 255, 32, 197, 1, 0, 1, 0, 0, 0, 165, 82, 0, 0, 1, 88, 0, 0, 133, 82, 0, 0, 223, 87, 0, 0, 57, 23, 229, 24, 0, 0, 0, 188, 38, 82, 115, 87, 85, 247, 0, 0, 0, 0, 1, 1, 232, 87, 144, 82, 226, 87, 136, 82, 133, 238, 0, 0, 0, 0, 1, 0, 80, 18, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 65, 129, 0, 0, 15, 153, 0, 0, 212, 128, 0, 0, 81, 154, 0, 0, 216, 0, 1, 0, 233, 253, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 51, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 0, 190, 11, 0, 0, 152, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 65, 76, 67, 69, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 164, 164, 164, 164, 164, 164, 164, 164, 81, 95, 65, 70, 0, 15, 1, 4, 39, 1, 0, 0, 0, 0, 0, 14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 200, 1, 74, 2, 124, 5, 162, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 24, 217, 30, 0, 227, 48, 30, 0, 242, 112, 27, 0, 2, 221, 22, 0, 0, 0, 0, 0, 134, 170, 23, 0, 109, 234, 28, 0, 73, 86, 32, 0, 205, 80, 33, 0, 108, 127, 33, 0, 225, 107, 30, 0, 0, 0, 0, 0, 29, 190, 33, 0, 197, 169, 34, 0, 35, 189, 35, 0, 142, 230, 35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 23, 1, 15, 1, 7, 1, 255, 0, 247, 0, 239, 0, 247, 0, 255, 
0, 7, 1, 15, 1, 23, 1, 31, 1, 31, 1, 27, 1, 23, 1, 19, 1, 15, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0\n",
+      "    exif:WhiteBalance: 0\n",
+      "    exif:XResolution: 72/1\n",
+      "    exif:YCbCrPositioning: 1\n",
+      "    exif:YResolution: 72/1\n",
+      "    Profile-exif: 20503 bytes\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%bash\n",
+    "identify -verbose campus.jpg | grep \"exif:\"\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Phone cameras typically report focal length as a 35mm equivalent.  Confusingly, to get focal length in pixels, divide this number by *36* (the width in millimeters of a 35mm film frame), then multiply by the width of the image in pixels.  Hence, for this image, the focal length is:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2448.0\n"
+     ]
+    }
+   ],
+   "source": [
+    "f_length_35 = 27\n",
+    "img_width = 3264\n",
+    "\n",
+    "f_length = f_length_35/36*img_width\n",
+    "print(f_length)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
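The least-squares pose estimation described in the notebook above can be sketched roughly as follows. This is a minimal illustration on synthetic points, not the course's reference solution. The function names (`rotation_matrix`, `project`, `residuals`, `estimate_pose`) are hypothetical, and the Euler-angle order and the choice of the camera z axis as the optical axis are assumptions, since the notebook does not fix a convention. How the three angles map onto yaw, pitch, and roll depends on that convention.

```python
import numpy as np
from scipy.optimize import least_squares


def rotation_matrix(rz, ry, rx):
    """Rotation from three Euler angles (assumed order: about z, then y, then x)."""
    cz, sz = np.cos(rz), np.sin(rz)
    cy, sy = np.cos(ry), np.sin(ry)
    cx, sx = np.cos(rx), np.sin(rx)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rx @ Ry @ Rz


def project(p, X, f, c):
    """Project world points X (n x 3) to pixel coordinates (n x 2) for pose p."""
    R = rotation_matrix(*p[3:])
    Xc = (X - p[:3]) @ R.T        # translate, then rotate into the camera frame
    xy = Xc[:, :2] / Xc[:, 2:3]   # perspective division (camera z = optical axis)
    return f * xy + c / 2.0       # scale by focal length, offset to image center


def residuals(p, X, u, f, c):
    """Stacked differences between projected and observed pixel coordinates."""
    return (project(p, X, f, c) - u).ravel()


def estimate_pose(p0, X, u, f, c):
    """Fit the six pose parameters with Levenberg-Marquardt.

    method="lm" needs at least as many residuals as parameters,
    so at least three GCPs (two residuals each) are required here.
    """
    result = least_squares(residuals, p0, args=(X, u, f, c), method="lm")
    return result.x


# Synthetic check: made-up points in front of the camera, made-up true pose.
f = 2448.0                        # focal length in pixels (value from the notebook)
c = np.array([3264.0, 2448.0])    # image width and height in pixels
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(-50.0, 50.0, 6),
    rng.uniform(-30.0, 30.0, 6),
    rng.uniform(100.0, 300.0, 6),
])
p_true = np.array([1.0, -2.0, 3.0, 0.02, -0.01, 0.03])
u = project(p_true, X, f, c)      # noise-free "observations"

p_est = estimate_pose(np.zeros(6), X, u, f, c)
```

With real GCPs the observations are noisy and the world coordinates are large UTM values, so a reasonable initial guess for the pose (for example, the approximate camera position read from a map) matters much more than in this synthetic case.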
diff --git a/Misc. Items/Cameralocation.png b/Misc. Items/Cameralocation.png