Monday, June 8, 2015

Limits of perceiving images through light can be remedied with acoustic modeling

The perception of images through light excels at perspective geometry, an abstract, generalized simplification of geometry. This is in contrast to projective geometry, which works in 3D. The difference between the two is that in projective geometry it is not possible to talk about angles as one can in a Euclidean space such as perspective geometry. A good tool for getting familiar with the topics discussed in this blog is to play with GeoGebra.

All objects that have energy produce sounds of their own. Perhaps these sounds are so constant that it is impossible for a human to perceive them. However, watchalerts uses moving objects so we can measure redshift. Using a wave slower than light, such as fluorescence or sound, makes for a more accurate image. The speed of sound in air is 340 meters per second, and in a denser medium such as water it travels roughly four times faster. The speed of light in air is about 300 million meters per second. The velocity at which a single frequency component of a wave propagates is called the phase velocity, and the phase refractive index measures how much slower that phase velocity is relative to the speed of light in a vacuum.

Augmented reality algorithms developed for Qualcomm's Vuforia take multiple pictures and find points of commonality between them; this is known as homography. Acoustics is capable of noticing sounds and finding commonalities between them in a similar way. Vuforia then draws an image based on mathematical principles using a codebook, which is a list of reconstruction levels such as distinct object edges, colors such as gradients, and other features. Modeling human perception, these pay attention to the distinction between object and scene perception. Also like human perception, the reconstruction uses top-down feature finding. I am looking at the possibility of using a feature gate, which combines top-down and bottom-up feature finding. These rely on visual saliency to determine features.
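The point-matching step described above reduces to estimating a homography between views. This is not Vuforia's implementation, just a minimal NumPy sketch of the standard direct linear transform (DLT), using four hand-picked correspondences in place of detected feature points (which would normally come from a detector such as OpenCV's):

```python
import numpy as np

def find_homography(src, dst):
    """Estimate the 3x3 homography H mapping src points to dst points
    (dst ~ H @ src in homogeneous coordinates) via the direct linear
    transform: two equations per correspondence, then take the null
    space of the stacked matrix with the SVD."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H[2, 2] == 1

def apply_homography(H, pt):
    """Map one 2D point through H (with the homogeneous divide)."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# Four corners of a unit square mapped to a scaled, translated square.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(2, 3), (4, 3), (4, 5), (2, 5)]  # scale by 2, translate by (2, 3)
H = find_homography(src, dst)
```

With exact correspondences the recovered H reproduces the mapping; real pipelines use many noisy matches plus a robust estimator such as RANSAC.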
Some features consist of RGB color, color saturation, orientation (edges and symmetry), contrast, foreground, and background. These filter dictionaries are capable of deriving multiple feature images from a single input. Sound is very much like light as described by quantum physics: like light, interference comes from a number of different sources and from scatter. We perceive objects based on how the waveforms interact with the objects' physical characteristics.
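Two of the feature maps listed above can be sketched with NumPy. The box-blur surround here is my own simplification for illustration, not how any particular saliency toolkit computes center-surround contrast:

```python
import numpy as np

def saturation_map(rgb):
    """Per-pixel color saturation as the spread between the largest
    and smallest channel values, one of the simplest feature maps."""
    rgb = rgb.astype(float)
    return rgb.max(axis=-1) - rgb.min(axis=-1)

def contrast_map(gray, k=3):
    """Center-surround contrast: |pixel - local mean|, where the
    local mean is taken over a k x k box (a crude 'surround')."""
    pad = k // 2
    padded = np.pad(gray.astype(float), pad, mode='edge')
    h, w = gray.shape
    acc = np.zeros((h, w))
    # Build the local box mean by averaging shifted copies of the image.
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + h, dx:dx + w]
    return np.abs(gray - acc / (k * k))

# A flat gray image should be zero in both maps (nothing is salient).
flat = np.full((4, 4, 3), 128, dtype=np.uint8)
```

A full saliency model would normalize and sum many such maps across scales and orientations; these two show the shape of the idea.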

I've also looked at AAC audio encoding, which compresses by overlapping each frame slightly with the preceding and succeeding frames to make sure the energy stays consistent. In our case we depend on automatic color equalization, using mean squared error and peak signal-to-noise ratio to detect distortion. Once detected, we calculate the wavelength of the noisy signal and modify it so it is within range. I am also looking at degaussing.
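The two distortion measures mentioned above are standard and easy to state exactly. A small sketch with NumPy (the toy "one corrupted pixel" image is mine, just to exercise the formulas):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images of the same shape."""
    return float(np.mean((a.astype(float) - b.astype(float)) ** 2))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    e = mse(a, b)
    if e == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(peak ** 2 / e)

clean = np.zeros((8, 8), dtype=np.uint8)
noisy = clean.copy()
noisy[0, 0] = 16  # one corrupted pixel: MSE = 16^2 / 64 = 4.0
```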

There is much work done with OpenCV and ARToolKit to detect people as regions of interest. Although computer face detection surpassed human performance as of 2014 (as described in the book Data and Goliath), even the best-known cloud computing platforms are not perfect; Facebook, for example, has difficulty with image-detection thresholds. The paper Image Analysis and Understanding Techniques for Breast Cancer Detection from Digital Mammograms discusses thresholding when abnormalities are present. It describes pre-processing through histograms, then segmentation through methods such as connectivity and compactness, regularity of boundaries, homogeneity in color and texture, and differentiation from neighboring regions. Some of these methods involve region growing, split-and-merge, k-means clustering, and the watershed technique. Should these fail, there are adaptive thresholding methods: histogram shape-based methods, including Otsu's method, which automatically selects a threshold to reduce the image to a binary image; clustering-based methods, including fuzzy c-means and k-means; entropy-based methods, which consider contrast, energy, and correlation (how a small patch of the picture relates to the image as a whole); and spatial methods.
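Otsu's method in particular is compact enough to write out. This is the textbook between-class-variance formulation in NumPy, on a made-up two-cluster image, not code from any of the toolkits above:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: pick the threshold that maximizes the
    between-class variance of the resulting binary split."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, 0.0
    w0, sum0 = 0.0, 0.0
    for t in range(256):
        w0 += hist[t]          # weight of the "dark" class
        if w0 == 0:
            continue
        w1 = total - w0        # weight of the "bright" class
        if w1 == 0:
            break
        sum0 += t * hist[t]
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy two-cluster image: dark pixels near 10, bright pixels near 200.
img = np.array([[10, 12, 10, 200],
                [11, 10, 200, 201],
                [10, 200, 201, 199],
                [200, 201, 200, 198]], dtype=np.uint8)
t = otsu_threshold(img)
binary = img > t  # the binary image the text describes
```

In practice one would call a library routine (e.g. OpenCV's Otsu flag or scikit-image's `threshold_otsu`); the loop above just makes the criterion explicit.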

Some projects I'm currently looking at:
Neuromorphic Vision Toolkit (produced by the University of Southern California)
ARToolKit (authored by the University of Washington)
Kinovea (a camera-driven app for sports, much like Ubersense; although I found some things we can use, I like the idea of working with a codebook described in XML)
iSpy (a webcam platform used in combination with OpenCV; this provides nice interfaces for working with SVG and has different device drivers for attaching to Windows)
Vuforia (uses Unity 3D and Visual Studio with C#)

One nice thing about using these projects (Kinovea, iSpy, Vuforia, Unity 3D) is that they all have C# interfaces. The Blender project involves Python, and KNIME works with Java. This does not mean that I would hesitate to use those projects. Given the size of the C# tools, I would reflect them into IronPython, from there to Python, and use Kivy to make them work cross-platform.

Works Cited:
"Acoustic Camera." Wikipedia. Wikimedia Foundation, n.d. Web. 08 June 2015.

"Advanced Audio Coding." Wikipedia. Wikimedia Foundation, n.d. Web. 08 June 2015.

"CEM Lectures." YouTube. YouTube, n.d. Web. 08 June 2015.

"Compression Artifact." Wikipedia. Wikimedia Foundation, n.d. Web. 08 June 2015.

"Eric Betzig Plenary Presentation: Single Molecules, Cells, and Super-resolution Optics." YouTube. YouTube, n.d. Web. 08 June 2015.

"Inexpensive 'nano-camera' Can Operate at the Speed of Light." MIT News. N.p., n.d. Web. 08 June 2015.

"Keynote: Preservation and Exhibition of Historical 3D Movies [9011-83] [SD&A 2014]." YouTube. YouTube, n.d. Web. 08 June 2015.

"Korotkoff Sounds." Wikipedia. Wikimedia Foundation, n.d. Web. 08 June 2015.

"Lecture 22 (EM21) -- Slow Waves." YouTube. YouTube, n.d. Web. 08 June 2015.

McCormick, Douglas. "A 'Sound Camera' Zeroes In on Buzz, Squeak, and Rattle." N.p., n.d. Web. 08 June 2015.

"Microscopy: Super-Resolution: Overview and Stimulated Emission Depletion (STED) (Stefan Hell)." YouTube. YouTube, n.d. Web. 08 June 2015.

"Projective Geometry." Wikipedia. Wikimedia Foundation, n.d. Web. 08 June 2015.

"Q & A: Speed of Sound and Light." Q & A: Speed of Sound and Light. N.p., n.d. Web. 08 June 2015.

"Rectification of an Oblique Image." 8 June 2015. Speech.

"Short-time Fourier Transform." Wikipedia. Wikimedia Foundation, n.d. Web. 08 June 2015.

Southwall, Richard. "Projective Geometry 1 Without Equations, Conics & Spirals." 8 June 2015.

Srivastava, Rajeev, S. K. Singh, and K. K. Shukla. Research Developments in Computer Vision and Image Processing: Methodologies and Applications. N.p.: n.p., n.d. Print.

"Vuforia Tutorial: Qualcomm's Augmented Reality SDK." YouTube. YouTube, n.d. Web. 08 June 2015.