I am trying to describe a 3D swipe gesture (vertical or horizontal only, no diagonals) over a flat surface using conventional geometry or similar techniques, without machine learning if possible (so hidden Markov models, artificial neural networks, etc. are excluded). From several observations of the data retrieved from the device, I concluded that a swipe can be described, loosely, as a curve (or in some cases as a nearly straight line). With this question, I would like to know how a curve and a movement along a curve can be described in simple geometric terms in the most efficient way (mostly in terms of speed, but also memory).
This post is divided into two parts: one that contains information about the data used and one that gives an overview of what I have come up with so far. Apologies in advance for my poor drawing skills. :D
The 3D position data
The device I use transmits 3D points, each representing the position of the hand at a specific point in time. I can record and evaluate them. The following image shows a graphical representation of the data from two different perspectives, top-down and (more or less) isometric:
- XY plane view (left, also known as top-down view): only the values along the X and Y axes are taken into account for each sample. This view represents the surface of the device over which the movement of the hand is recorded.
- XYZ view (right, also known as isometric view): all three axes are taken into account for each sample. This view represents the complete 3D movement in the volume above the device surface, which defines the region in which gestures can be recognized.
In the next picture I added the hand movement recognized by the device:
The actual movement looks something like this:
Comparing the actual movement with the one detected by the device, I can mark almost half of the samples the device gave me as invalid, namely all samples at a limit value (a position along each axis can be between 0 and 65534), because these do not describe the actual movement of the hand from the point of view of the user of the device. In the image below, invalid data is shown as the part of the trajectory that is covered by a polygon:
Of course, sometimes the "valid" part of the trajectory is rather small compared to the invalid data:
The algorithm described below does not care how large the valid portion is, as long as there are at least 2 samples that are not limit positions, i.e. whose X and Y both differ from 0 and 65534. This results in a problem that I'll go into in more detail in the next part of this post.
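To make the validity rule concrete, here is a minimal sketch in Python. The names `Sample`, `is_valid` and `valid_samples` are hypothetical and not part of any device API; the only facts taken from above are the limit values 0 and 65534 and the rule that X and Y must both differ from them. Z is deliberately not checked, since the rule only mentions X and Y.

```python
from typing import List, Tuple

Sample = Tuple[int, int, int]  # (x, y, z) as reported by the device (assumed layout)

AXIS_MIN = 0
AXIS_MAX = 65534  # upper limit value of each axis


def is_valid(sample: Sample) -> bool:
    """A sample is valid when neither X nor Y sits on a limit value."""
    x, y, _z = sample
    return AXIS_MIN < x < AXIS_MAX and AXIS_MIN < y < AXIS_MAX


def valid_samples(samples: List[Sample]) -> List[Sample]:
    """Keep only the samples away from the limits, preserving their order."""
    return [s for s in samples if is_valid(s)]
```

The filter is a single pass over the samples and needs no extra state, so it could also be applied to each sample as it arrives instead of to a recorded list.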
Describe the movement
I thought about it and came up with the following:
Extract only the set of valid samples, i.e. exclude all samples with an edge position
For each sample, generate a local XY coordinate system that is aligned with the XY coordinate system of the device surface (for simplification :)):
Next, I would calculate the vector between the current and the next sample (if any) and the angle between this vector and the X axis (the Y axis would work just as well):
Based on the size of each angle, I can determine whether the movement between the current and the next sample tends more towards the horizontal or vertical and also in which direction.
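The steps above could be sketched roughly as follows. This is only one possible reading: the function names are made up for illustration, the 45 degree sector boundaries and the majority vote over all steps are my assumptions, and so is the convention that +Y means "up".

```python
import math
from collections import Counter
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) on the device surface


def segment_direction(p: Point, q: Point) -> Optional[str]:
    """Classify the step from p to q as 'right', 'left', 'up' or 'down'."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    if dx == 0 and dy == 0:
        return None  # repeated sample, no direction
    # Angle between the step vector and the +X axis, in (-180, 180].
    angle = math.degrees(math.atan2(dy, dx))
    if -45 <= angle < 45:
        return "right"
    if 45 <= angle < 135:
        return "up"  # assumption: +Y is "up"
    if -135 <= angle < -45:
        return "down"
    return "left"


def swipe_direction(points: Sequence[Point]) -> Optional[str]:
    """Majority vote over the directions of all consecutive steps."""
    votes = Counter()
    for p, q in zip(points, points[1:]):
        direction = segment_direction(p, q)
        if direction is not None:
            votes[direction] += 1
    if not votes:
        return None  # fewer than two distinct samples
    return votes.most_common(1)[0][0]
```

If only the four axis directions matter, comparing `abs(dx)` with `abs(dy)` and checking the sign of the larger one gives the same label away from the exact diagonals and avoids the trigonometric call.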
This should enable me to determine the general direction of the swipe as well as its position above the surface. I have swiped a lot :D, but since I want to describe this more formally, I have to describe my results, so I need a way to describe and classify a curve based on its properties. Maybe calculate the curvature of the entire trajectory?
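As a rough idea of what "curvature of the entire trajectory" could mean for sampled points, here is a sketch of two cheap measures. Both are my own choice of stand-ins rather than an established swipe descriptor: a straightness ratio (chord length divided by path length) and the total signed turning angle between consecutive steps.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def path_length(points: Sequence[Point]) -> float:
    """Sum of the lengths of all consecutive steps."""
    return sum(math.hypot(q[0] - p[0], q[1] - p[1])
               for p, q in zip(points, points[1:]))


def straightness(points: Sequence[Point]) -> float:
    """Chord length / path length: 1.0 for a straight line, smaller for curvier paths."""
    length = path_length(points)
    if length == 0:
        raise ValueError("need at least two distinct points")
    chord = math.hypot(points[-1][0] - points[0][0],
                       points[-1][1] - points[0][1])
    return chord / length


def total_turning(points: Sequence[Point]) -> float:
    """Sum of signed turning angles in radians (positive = counterclockwise)."""
    total = 0.0
    for a, b, c in zip(points, points[1:], points[2:]):
        ux, uy = b[0] - a[0], b[1] - a[1]
        vx, vy = c[0] - b[0], c[1] - b[1]
        cross = ux * vy - uy * vx
        dot = ux * vx + uy * vy
        total += math.atan2(cross, dot)
    return total
```

Both values are computed in one pass and are independent of where the swipe happens on the surface, which might make them usable as simple features for grouping curves.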
There are, of course, some problems with this algorithm that came to my mind:
I searched online before coming up with the algorithm described above, but couldn't find anything. The topic of curve classification doesn't seem to be that popular, or the search terms I use are too broad or too restrictive. The classification is not that important here (as opposed to the points that follow), but it would still be nice to divide the resulting curves into sets, each representing a swipe gesture.
The next thing I thought about is curve fitting. I've read articles about it, but apart from a few assignments during a math course at my university, I have only dealt with Bezier curves. Can someone tell me if curve fitting is a plausible solution for my case? Since it is curve fitting, one might rightly assume that we need an initial curve to fit against. This would require detecting swipes and then extracting a possible optimal curve, something like an "average" of all curves for a given swipe. I can use the first algorithm I described above to get a compact description of a curve, and then save and analyze multiple curves for a given swipe to get the "perfect" curve. How do you proceed with curve classification?
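To illustrate the "average curve" idea, here is a minimal sketch that resamples each recorded curve to the same number of points (evenly spaced by arc length) and averages them point by point. Note that this is template averaging rather than least-squares curve fitting. All names are hypothetical, and it assumes every curve for a given swipe is recorded in the same direction and in the same coordinate frame.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def resample(points: Sequence[Point], n: int) -> List[Point]:
    """Return n points evenly spaced by arc length along the polyline."""
    if len(points) < 2 or n < 2:
        raise ValueError("need at least two points and n >= 2")
    cum = [0.0]  # cumulative arc length at each input point
    for p, q in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(q[0] - p[0], q[1] - p[1]))
    total = cum[-1]
    if total == 0:
        raise ValueError("curve has zero length")
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else min(1.0, (target - cum[seg]) / span)
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def mean_curve(curves: Sequence[Sequence[Point]], n: int = 16) -> List[Point]:
    """Point-wise average of all curves after resampling each to n points."""
    resampled = [resample(c, n) for c in curves]
    return [(sum(c[i][0] for c in resampled) / len(resampled),
             sum(c[i][1] for c in resampled) / len(resampled))
            for i in range(n)]


def mean_distance(curve: Sequence[Point], template: Sequence[Point]) -> float:
    """Average distance between corresponding points of curve and template."""
    pts = resample(curve, len(template))
    return sum(math.hypot(p[0] - t[0], p[1] - t[1])
               for p, t in zip(pts, template)) / len(template)
```

A new swipe could then be assigned to whichever stored template gives the smallest `mean_distance`. Whether that is accurate enough for real device data is something I cannot tell without testing.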