The feature selection results and the subsequent estimation performance of the LSTM, XGBoost, and LinReg models using 3DMoCap and 2D-DV data are presented as RMSE in Table 2 and as \(R^2\) in Fig. 7. Figure 8 shows illustrative examples of the estimation performance of the three models using 3D and 2D data over a randomly selected sequence (1000 frames) from one person during one trial of play.
Furthermore, the contribution of each joint center to estimation performance was computed using a permutation procedure. Here, the data in each feature are shuffled at random, which breaks the real-world relationship between the feature and the target. The resulting difference in estimation performance between the shuffled and unshuffled feature indicates how much the model depends on that feature [35]. This is repeated for all features and shows which features, i.e. joint centers, are most important to estimation performance. The feature importance analysis using 3DMoCap data showed that eight joint centers contributed 82.9% of the information needed to estimate the GRF components. These were the right and left wrists, the right elbow, the left knee, and the torso joint centers (left and right shoulders and left and right hips). The models were subsequently retrained using these joints.
Using 2D-DV data, eight joint centers likewise accounted for a total contribution of 78%: the left wrist, shoulder, hip, knee, and ankle, and the right shoulder, knee, and ankle. The relative contributions of all joint centers can be seen in Figs. 5 and 6.
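The permutation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names `r2_score` and `permutation_importance`, the toy data, and the least-squares stand-in model are all hypothetical, and each column is treated as one feature (grouping several coordinates into one joint center is not shown).

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination (standard definition, assumed here)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean drop in R^2 when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = r2_score(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its relationship with the target
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - r2_score(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances

# Toy data (hypothetical): feature 0 matters most, feature 2 not at all
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Stand-in model: ordinary least squares fitted on the toy data
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda A: A @ coef

imp = permutation_importance(predict, X, y)
fractions = imp / imp.sum()  # each feature's share of the total R^2 drop
print(fractions)
```

Normalizing the drops so that they sum to one gives a per-feature fraction comparable in spirit to the "fraction of \(R^2\)" reported for the joint centers, although the exact normalization used in the study is not stated in this section.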

Overview of the joint centers’ total impact (fraction of \(R^2\)) on estimation performance when using 3DMoCap data

Overview of the joint centers’ total impact (fraction of \(R^2\)) on estimation performance when using 2D-DV data
Estimation error
Prediction performance is presented in Table 2 as the mean (± 1SD) RMSE (% BW) of the three models for the three force components, using 3DMoCap and 2D-DV data. The LSTM model outperforms both XGBoost and LinReg with both 3DMoCap and 2D-DV data. The XGBoost model performs at the same level as LinReg with both data types. The lowest mean RMSE (4.3% BW) was achieved by the LSTM model on the \(F_z\) component using 3DMoCap data; the highest (23.5% BW) was obtained by the LinReg model on the \(F_y\) component using 2D-DV data. RMSE was generally higher using 2D-DV data than using 3DMoCap data.
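As an illustration of the error metric, RMSE expressed in % BW can be computed as sketched below. This assumes that forces in newtons are normalized by body weight (body mass times gravitational acceleration); the function name `rmse_percent_bw`, its parameters, and the example values are hypothetical and are not taken from the study.

```python
import numpy as np

def rmse_percent_bw(force_true_n, force_pred_n, body_mass_kg, g=9.81):
    """RMSE between measured and estimated force, as a percentage of body weight."""
    body_weight_n = body_mass_kg * g  # body weight in newtons
    err_pct = (np.asarray(force_pred_n) - np.asarray(force_true_n)) / body_weight_n * 100.0
    return float(np.sqrt(np.mean(err_pct ** 2)))

# Hypothetical example: a 70 kg player and a constant error of 10% of body weight
bw = 70.0 * 9.81
measured = np.array([0.0, 1.0, 2.0]) * bw
estimated = measured + 0.10 * bw
result = rmse_percent_bw(measured, estimated, body_mass_kg=70.0)
print(round(result, 2))  # 10.0
```

Normalizing by body weight makes errors comparable across participants of different mass.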
Model fit
As shown in Fig. 7, the LSTM \(R^2\) is consistently higher than that of the XGBoost and LinReg models using both MoCap and 2D-DV data. Using the MoCap data, the mean (± 1SD) LSTM \(R^2\) was .589 (.34), .796 (.31), and .971 (.05) in the \(F_x\), \(F_y\), and \(F_z\) components, respectively, and the XGBoost \(R^2\) was − .246 (.27), .114 (.36), and .863 (.16), respectively. The LinReg model achieved a mean \(R^2\) of − .168 (.28), − .054 (.21), and .856 (.17), respectively. Using 2D-DV data, all models achieved lower \(R^2\). LSTM achieved a mean (± 1SD) \(R^2\) of .379 (.55) in \(F_x\), .579 (.58) in \(F_y\), and .770 (.45) in \(F_z\). The XGBoost mean (± 1SD) \(R^2\) was − .313 (.26) in \(F_x\), − .234 (.53) in \(F_y\), and .564 (.31) in \(F_z\). Here, the LinReg mean (± 1SD) \(R^2\) was − .266 (.39), − .950 (2.23), and .617 (.28) for the \(F_x\), \(F_y\), and \(F_z\) components, respectively.
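For reading the negative values above, it may help to recall the standard definition of the coefficient of determination. This section does not state which definition was used, so the following is an assumption; the symbols \(y_t\) (measured force at frame \(t\)), \(\hat{y}_t\) (estimated force), \(\bar{y}\) (mean measured force), and \(N\) (number of frames) are notation introduced here for illustration.

```latex
\[
R^2 = 1 - \frac{\sum_{t=1}^{N} \left(y_t - \hat{y}_t\right)^2}{\sum_{t=1}^{N} \left(y_t - \bar{y}\right)^2}
\]
```

Under this definition, \(R^2 = 1\) is a perfect fit, \(R^2 = 0\) matches a model that always outputs the mean measured force, and \(R^2 < 0\) means the squared error is larger than that of this mean baseline.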

Box plots showing median \(R^2\) from LSTM, XGBoost, and LinReg models in all three GRF components. A Results using 2D-DV data, and B MoCap data
Estimation plots
Figure 8 presents example plots from the left foot, showing the component values estimated by the XGBoost, LSTM, and LinReg models over a random set of 1000 frames, along with the ground truth component values. The LSTM model estimates all three components very well using both MoCap and 2D-DV data. The \(F_x\) component appears to be the least accurately estimated, although the LSTM model captures the major changes in BW here as well. The XGBoost model also estimates \(F_z\) very well, but not \(F_y\) and \(F_x\) to the same degree. In \(F_x\) and \(F_y\), the XGBoost model follows the major trends in the data, but rapid changes in force are not estimated well. The LinReg model estimates major changes in \(F_z\), but not with the level of detail seen in the LSTM or XGBoost models. The \(F_x\) and \(F_y\) components, however, are not estimated as well by the LinReg model.

Example of estimation performance on the left side from XGBoost (green), LSTM (blue), and LinReg (orange) models in each GRF component over 1000 frames, along with the ground truth GRF (black). A, C, and E Results from one 2D-DV dataset, and B, D and F One MoCap dataset
Test/train error
The XGBoost and LSTM RMSE on the train and test data are presented in Figs. 9 and 10, respectively. The test error for the XGBoost model is consistently about two to three times higher than the train error, using both 3DMoCap and 2D-DV data. For XGBoost, the mean (± SD) train/test RMSE was 8.8 (.5)/19.8 (3.7) %BW using 2D-DV data and 5.2 (.1)/15.9 (3.0) %BW using 3DMoCap data. For LSTM, the mean (± SD) train/test RMSE was 9.8 (.6)/11.5 (7.6) %BW using 2D-DV data and 11.85 (.2)/13.6 (2.9) %BW using 3DMoCap data.
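The gap between train and test error can be expressed as a test/train ratio computed from the mean RMSE values reported above. The short sketch below only restates that arithmetic; the variable names are illustrative.

```python
# Mean train/test RMSE (%BW) as reported in the text: (train, test)
reported = {
    ("XGBoost", "2D-DV"): (8.8, 19.8),
    ("XGBoost", "3DMoCap"): (5.2, 15.9),
    ("LSTM", "2D-DV"): (9.8, 11.5),
    ("LSTM", "3DMoCap"): (11.85, 13.6),
}

# Ratio of test error to train error; values well above 1 suggest overfitting
ratios = {key: test / train for key, (train, test) in reported.items()}

for (model, data), ratio in ratios.items():
    print(f"{model:8s} {data:8s} test/train = {ratio:.2f}")
# XGBoost  2D-DV    test/train = 2.25
# XGBoost  3DMoCap  test/train = 3.06
# LSTM     2D-DV    test/train = 1.17
# LSTM     3DMoCap  test/train = 1.15
```

On these means, the XGBoost test error is roughly 2.3 to 3.1 times its train error, whereas the LSTM ratios stay close to 1.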

XGBoost model test/train RMSE (%BW) from each cross-validation iteration using 3DMoCap and 2D-DV data

LSTM model test/train RMSE (%BW) from each cross-validation iteration using 2D-DV (A) and 3DMoCap (B) data