In an attempt to model the way humans see the world, we'll first "categorize" the raw data for a specific moment into patterns by segmenting the image into self-similar patches. Then, these patches will be stitched together into an adjacency patchwork. This patchwork will describe each object in the scene by "flattening" the object out into a scale-varying plane.

The problem definition is recursive, so subsets of the data can be "worked on" independently, which lends itself to parallelism.

Stage 0 - Data Preparation

Here, we're going to convert the video streams into a set of image sequences. Each continuous stream will be labeled with a letter (a, b, c, …), and each frame in the sequence will be numbered starting from zero (0000, 0001, …).
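As a minimal sketch of that labeling scheme, the helper below builds one file name per frame. The function name `frame_name`, the underscore separator, and the `png` extension are assumptions made for illustration; only the letter-per-stream and four-digit, zero-based frame numbering come from the description above.

```python
from string import ascii_lowercase


def frame_name(stream_index: int, frame_index: int, ext: str = "png") -> str:
    """Build a name such as 'a_0000.png' for one frame of one stream."""
    if not 0 <= stream_index < len(ascii_lowercase):
        # Assumption: this sketch only handles streams a through z.
        raise ValueError("stream_index must be between 0 and 25")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return f"{ascii_lowercase[stream_index]}_{frame_index:04d}.{ext}"


# Two streams with three frames each.
names = [frame_name(s, f) for s in range(2) for f in range(3)]
print(names)
```

Zero-padding the frame number keeps the files in frame order when they are sorted by name.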

Then, this set of images will need to be "prepped" by an outlining / edge-detection filter.
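The notes do not say which filter is meant. As one possible stand-in, the sketch below applies a Sobel operator to a grayscale image stored as a list of rows and thresholds the gradient magnitude. The function name `sobel_edges` and the threshold value are assumptions, and border pixels are simply left unmarked.

```python
def sobel_edges(gray, threshold=100.0):
    """Return a binary edge map (1 = edge) for a 2D list of grayscale values."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


# Dark left half, bright right half: the edge should follow the boundary.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_edges(image)
for row in edges:
    print(row)
```

On real frames, an image library would do this far faster; the pure-Python loop is only meant to make the operation explicit.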

The segment utility segments the image into parts, giving each part a unique color. For each part, find a generalized (average?) characteristic from the original image. (For instance, a piece of paper would generally have the color of the paper…) This characteristic will assist in tracking parts through the time dimension, since it should remain mostly the same from frame to frame. If it varies too greatly, it's going to be difficult to pull information from the image sequence anyway. In other words, we're looking for a "fingerprint" for each patch that can be tracked.
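One simple candidate for such a fingerprint is the mean color of each part. The sketch below assumes the segment utility's output has already been converted into a label map (one patch label per pixel) that lines up with the original RGB image. The name `patch_fingerprints` and both data layouts are hypothetical.

```python
from collections import defaultdict


def patch_fingerprints(labels, image):
    """Map each patch label to the mean (r, g, b) of its pixels in the original image."""
    sums = defaultdict(lambda: [0, 0, 0])
    counts = defaultdict(int)
    for label_row, pixel_row in zip(labels, image):
        for label, pixel in zip(label_row, pixel_row):
            counts[label] += 1
            for c in range(3):
                sums[label][c] += pixel[c]
    return {
        label: tuple(total / counts[label] for total in sums[label])
        for label in counts
    }


# Patch 0 is paper-colored, patch 1 is dark.
labels = [[0, 0, 1],
          [0, 1, 1]]
image = [[(250, 250, 240), (254, 250, 244), (10, 20, 30)],
         [(252, 250, 242), (20, 20, 30), (30, 20, 30)]]
fp = patch_fingerprints(labels, image)
print(fp)
```

A mean alone cannot tell a uniform patch from a strongly textured one with the same average, so a richer descriptor may be needed in practice.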


Stage 1 - Data Segmentation and Flattening

  1. Segment the images into self-similar patches
    1. Run edge detection on the images to approximate the edges of patches
      1. Crawler?
      2. Use outlined and original image as a guide
    2. Describe patch by a generalized histogram - ex: find mean and for each channel
    3. Match patches within image sequence - assume that the patches do not move "greatly" from frame to frame
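The matching step can be sketched as a greedy pairing between two consecutive frames. Everything specific here is an assumption made for illustration: each patch is summarized by a centroid and a mean color, the thresholds `max_shift` and `max_color_diff` are arbitrary, and the cost is just the sum of the two distances.

```python
import math


def match_patches(prev, curr, max_shift=20.0, max_color_diff=30.0):
    """Greedily pair patches of two consecutive frames.

    prev and curr map a patch id to (centroid, mean_color).
    Returns a dict {prev_id: curr_id}.
    """
    candidates = []
    for pid, (p_center, p_color) in prev.items():
        for cid, (c_center, c_color) in curr.items():
            shift = math.dist(p_center, c_center)
            color_diff = math.dist(p_color, c_color)
            # Reject pairs that moved too far or changed color too much.
            if shift <= max_shift and color_diff <= max_color_diff:
                candidates.append((shift + color_diff, pid, cid))
    matches, used = {}, set()
    for _, pid, cid in sorted(candidates):  # cheapest pairs first
        if pid not in matches and cid not in used:
            matches[pid] = cid
            used.add(cid)
    return matches


prev = {"p0": ((10, 10), (250, 250, 240)),
        "p1": ((40, 12), (20, 20, 30))}
curr = {"q0": ((43, 13), (22, 21, 29)),
        "q1": ((12, 11), (248, 251, 241)),
        "q2": ((90, 90), (250, 250, 240))}
matches = match_patches(prev, curr)
print(matches)
```

Patch `q2` has the same color as `p0` but lies too far away, so the "does not move greatly" assumption excludes it.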

Stage 2 - Stitching Patchwork and Testing Coherency

  1. Stitch the patches into adjacent patchworks
    1. Use "feedback loops" to test temporal / spatial coherence
    2. If texture A and B are adjacent in frame f but not in frame g, either they're not adjacent on the object's surface or there is something occluding the joint between A and B.
  2. "Flatten" surface texture - Find highest quality, average level set of pixels that best describe the texture
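The adjacency test in step 1 can be sketched as follows, assuming each frame is available as a label map whose patch labels are already consistent across frames. The function names are hypothetical, and 4-connectivity is an arbitrary choice.

```python
def adjacency_pairs(labels):
    """Collect unordered pairs of patch labels that touch (4-connectivity) in one frame."""
    pairs = set()
    h, w = len(labels), len(labels[0])
    for y in range(h):
        for x in range(w):
            a = labels[y][x]
            # Checking right and down neighbors covers every touching pair once.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and labels[ny][nx] != a:
                    pairs.add(frozenset((a, labels[ny][nx])))
    return pairs


def incoherent_pairs(frames):
    """Pairs adjacent in some frame but not in every frame where both patches appear."""
    per_frame = [adjacency_pairs(f) for f in frames]
    visible = [{label for row in f for label in row} for f in frames]
    suspects = set()
    for pair in set().union(*per_frame):
        for adjacent, seen in zip(per_frame, visible):
            if pair <= seen and pair not in adjacent:
                suspects.add(pair)
    return suspects


frame_f = [["A", "B", "B"]]   # A touches B
frame_g = [["A", "C", "B"]]   # C sits between A and B
suspects = incoherent_pairs([frame_f, frame_g])
print(suspects)
```

A flagged pair only says the adjacency is inconsistent; deciding between "not adjacent on the surface" and "joint occluded" needs further evidence, such as the feedback loops mentioned above.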

Stage 3 - 3D Reconstruction

  1. SIFT (scale-invariant feature transform) and SfM (structure from motion)? to create sparse 3D reconstruction
  2. Describe how all permutations change the texture's representation (shade, highlight, etc.)