usage: extract_face_data.py [-h] --inframes INFRAMES --outlandmarks
                            OUTLANDMARKS
                            [--outnosetipposition OUTNOSETIPPOSITION]
                            [--outfacerotation OUTFACEROTATION]
                            [--outfacescale OUTFACESCALE]
                            [--outblendshapes OUTBLENDSHAPES]
                            [--outcompositeframes OUTCOMPOSITEFRAMES]
                            [--normalize-landmarks]

Uses mediapipe to extract the face mesh data from the frames of a video.

options:
  -h, --help            show this help message and exit
  --inframes INFRAMES, --invideo INFRAMES
                        Path to a video or image directory providing the
                        frames with the face of a person.
  --outlandmarks OUTLANDMARKS
                        Path to the output numpy array of shape [N][478][3],
                        where N is the number of video frames, 478 is the
                        number of landmarks of the
                        [MediaPipe](https://mediapipe.dev) face mesh, and 3
                        stores the <x,y,z> 3D coordinates. If no face is
                        detected in a frame, all of its values are NaN. If
                        more than one face is detected, only the first in the
                        mediapipe list is used.
  --outnosetipposition OUTNOSETIPPOSITION
                        Path to an output numpy array of shape [N][3] with the
                        x,y,z movement of the nose tip in space. N is the
                        number of video frames. As in MediaPipe, the X and Y
                        coordinates are normalized to the range [0,1]
                        relative to the frame size.
  --outfacerotation OUTFACEROTATION
                        Path to an output numpy array of shape [N][3][3] with
                        the 3x3 rotation matrix of the face. N is the number
                        of video frames.
  --outfacescale OUTFACESCALE
                        Path to the output numpy array of shape [N] with the
                        scaling of the face. N is the number of video frames.
                        This is the scaling factor needed to resize the
                        vertical distance between ear and jaw base to 10
                        percent of the height of the frame.
  --outblendshapes OUTBLENDSHAPES
                        Path to the output numpy array of shape [N][52] with
                        the blendshape activation values. N is the number of
                        video frames. If no face is detected in a frame, all
                        of its values are NaN. If more than one face is
                        detected, only the first in the mediapipe list is
                        used. This is the list of blendshapes:
                        MP_BLENDSHAPES=["_neutral", "browDownLeft",
                        "browDownRight", "browInnerUp", "browOuterUpLeft",
                        "browOuterUpRight", "cheekPuff", "cheekSquintLeft",
                        "cheekSquintRight", "eyeBlinkLeft", "eyeBlinkRight",
                        "eyeLookDownLeft", "eyeLookDownRight",
                        "eyeLookInLeft", "eyeLookInRight", "eyeLookOutLeft",
                        "eyeLookOutRight", "eyeLookUpLeft", "eyeLookUpRight",
                        "eyeSquintLeft", "eyeSquintRight", "eyeWideLeft",
                        "eyeWideRight", "jawForward", "jawLeft", "jawOpen",
                        "jawRight", "mouthClose", "mouthDimpleLeft",
                        "mouthDimpleRight", "mouthFrownLeft",
                        "mouthFrownRight", "mouthFunnel", "mouthLeft",
                        "mouthLowerDownLeft", "mouthLowerDownRight",
                        "mouthPressLeft", "mouthPressRight", "mouthPucker",
                        "mouthRight", "mouthRollLower", "mouthRollUpper",
                        "mouthShrugLower", "mouthShrugUpper",
                        "mouthSmileLeft", "mouthSmileRight",
                        "mouthStretchLeft", "mouthStretchRight",
                        "mouthUpperUpLeft", "mouthUpperUpRight",
                        "noseSneerLeft", "noseSneerRight"]
  --outcompositeframes OUTCOMPOSITEFRAMES, --outcompositevideo OUTCOMPOSITEFRAMES
                        Path to a video file or a directory for image files.
                        The output has the same resolution and content as the
                        input frames, plus an overlay of the face landmarks.
                        The blue landmarks are drawn by mediapipe. The red
                        landmarks, possibly normalized and drawn in the
                        upper-left quadrant, are the output values.
  --normalize-landmarks
                        If specified, neutralizes the head translation,
                        rotation, and zoom. At each frame, a counter-rotation,
                        counter-translation, and counter-scaling are applied
                        so that the nose points at the camera with the head
                        upright, the nose tip is at the center of the frame,
                        and the head keeps the same size.
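
Because frames without a detected face are stored as all-NaN, downstream code usually has to mask them out before computing statistics. The following is a minimal sketch of one way to do that, not part of the tool itself. It uses synthetic arrays with the documented shapes in place of real output files; the helper name detected_frames and the constant names are hypothetical, and loading real files with np.load assumes the outputs are saved in .npy format, which the help text does not state.

```python
import numpy as np

N_LANDMARKS = 478  # landmark count mentioned in the help text
N_BLENDSHAPES = 52  # length of the MP_BLENDSHAPES list
JAW_OPEN = 25  # zero-based position of "jawOpen" in MP_BLENDSHAPES


def detected_frames(landmarks: np.ndarray) -> np.ndarray:
    """Return a boolean mask of shape [N]: True where a face was detected.

    Hypothetical helper: frames without a face are documented as all-NaN,
    so a frame counts as detected only if it contains no NaN at all.
    """
    return ~np.isnan(landmarks).any(axis=(1, 2))


# Synthetic stand-ins for the --outlandmarks and --outblendshapes arrays.
# With real data one would presumably call np.load(path) instead
# (assumption: the files are written in .npy format).
rng = np.random.default_rng(0)
landmarks = rng.random((4, N_LANDMARKS, 3))
blendshapes = rng.random((4, N_BLENDSHAPES))
landmarks[2] = np.nan  # simulate a frame with no face detected
blendshapes[2] = np.nan

mask = detected_frames(landmarks)

# Statistics over valid frames only; NaN frames would otherwise poison them.
mean_shape = landmarks[mask].mean(axis=0)  # shape [478][3]
jaw_open = blendshapes[mask, JAW_OPEN]  # one activation per valid frame

print(mask)
```

Dropping undetected frames changes the frame indexing, so keep the mask around if the results have to be aligned with the original video timeline.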