Computation of cross-covariance of state and action

Judging only from the docstrings of the relevant functions, I think I noticed a discrepancy with the paper. I have not checked the math in the code itself, so I may be wrong.

`V`, returned by `RbfController.compute_action()` in `controllers.py`, corresponds to cov[x,u].

From backtracking to `MGPR.predict_given_factorizations()` in `models/mgpr.py`, I think the docstrings indicate that:

**V = cov[x,x]^{-1} @ cov[x,pi] @ cov[pi,u]**

where pi denotes the action before squashing.

Section 5.5 of the 2015 paper says:

**V = cov[x,pi] @ cov[pi,pi]^{-1} @ cov[pi,u]**
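As a quick sanity check of the two expressions, here is a minimal NumPy sketch on a toy linear-Gaussian chain x -> pi -> u, where the exact cov[x,u] is known in closed form. Everything in it is an assumption made for illustration: the dimensions `D` and `K`, the matrices `A`, `B`, `Q`, and the linear maps themselves. It does not use the RBF controller or the squashing function, and it says nothing about what the repository code actually computes; it only compares the two formulas as written.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 3, 2  # hypothetical state and action dimensions


def random_spd(n):
    """Return a random symmetric positive-definite n x n matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)


# Toy chain (assumption): pi = A x + noise(Q), u = B pi + independent noise.
S_xx = random_spd(D)              # cov[x,x]
A = rng.standard_normal((K, D))
B = rng.standard_normal((K, K))
Q = random_spd(K)

S_xpi = S_xx @ A.T                # cov[x,pi]
S_pipi = A @ S_xx @ A.T + Q       # cov[pi,pi]
S_piu = S_pipi @ B.T              # cov[pi,u]
S_xu = S_xx @ A.T @ B.T           # exact cov[x,u] for this chain

# Expression as I read it from the docstrings.
V_docstring = np.linalg.solve(S_xx, S_xpi) @ S_piu
# Expression from section 5.5 of the paper.
V_paper = S_xpi @ np.linalg.solve(S_pipi, S_piu)

paper_matches = np.allclose(V_paper, S_xu)
docstring_matches = np.allclose(V_docstring, S_xu)
print("paper expression equals cov[x,u]:    ", paper_matches)
print("docstring expression equals cov[x,u]:", docstring_matches)
```

In this toy setting the two expressions are not interchangeable in general, so if the code is correct, the docstrings may be describing quantities scaled differently from how I read them.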

Are these expressions equivalent, or have I misread something? Thanks!
