Skip to content

Improve caching of equivalent variables in analyser model.#1389

Open
hsorby wants to merge 3 commits intocellml:mainfrom
hsorby:cache-equiv-vars
Open

Improve caching of equivalent variables in analyser model.#1389
hsorby wants to merge 3 commits intocellml:mainfrom
hsorby:cache-equiv-vars

Conversation

@hsorby
Copy link
Copy Markdown
Contributor

@hsorby hsorby commented Apr 27, 2026

@hsorby hsorby requested review from agarny and nickerso April 27, 2026 22:57
Copy link
Copy Markdown
Contributor

@agarny agarny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before reviewing further, I am going to wait for PR #1390 (for issue #1379) to be reviewed and merged in since it affects the caching code.

Comment on lines +603 to +610
* This test is a cached version of the internal areEquivalentVariables() utility.
* The cache is built when the model is analysed.
* If the model has changed since it was analysed, then the cache might not be valid
* and the result of this test may be incorrect.
*
* This function is intended to be used by the @ref Analyser whilst it is analysing the model.
* It is not intended to be used by external code.
* Although, we do not restrict the use of this function in case it does have other applications.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People who look at the API will not know (and don't need to know) about our internal areEquivalentVariables() utility, so I would remove that mention.

Also, don't insert new lines within a paragraph (i.e. between lines 603 & 604, 604 & 605, 608 & 609, and 609 & 610).

Copy link
Copy Markdown
Contributor

@agarny agarny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and definitely faster! Nice!

AnalyserModel::AnalyserModelImpl::buildEquivalentVariablesCache() should probably be called in AnalyserModel::areEquivalentVariables() (if the cache hasn't already been built), but having said that we know that the analyser is going to call AnalyserModel::areEquivalentVariables() many times, so I guess it's fine to call AnalyserModel::AnalyserModelImpl::buildEquivalentVariablesCache() upon retrieving the analyser model?

Comment thread src/analyser.cpp
Comment on lines +2327 to +2328
mAnalyserModel->mPimpl->buildEquivalentVariablesCache();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this to AnalyserModel::AnalyserModelImpl::create(), i.e.

AnalyserModelPtr AnalyserModel::AnalyserModelImpl::create(const ModelPtr &model)
{
    auto res = std::shared_ptr<AnalyserModel> {new AnalyserModel(model)};

    res->mPimpl->buildEquivalentVariablesCache();

    return res;
}

Comment thread src/analysermodel.cpp
Comment on lines +48 to +52
mEquivalentVariableCache.clear();
for (size_t i = 0; i < mModel->componentCount(); ++i) {
buildEquivalentVariablesCache(mModel->component(i));
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer my code not to be compact, i.e.

    mEquivalentVariableCache.clear();

    for (size_t i = 0; i < mModel->componentCount(); ++i) {
        buildEquivalentVariablesCache(mModel->component(i));
    }

Comment thread src/analysermodel.cpp
Comment on lines +56 to +70
for (size_t i = 0; i < component->variableCount(); ++i) {
auto variable = component->variable(i);
for (size_t j = 0; j < variable->equivalentVariableCount(); ++j) {
auto equivalentVariable = variable->equivalentVariable(j);
auto v1 = reinterpret_cast<uintptr_t>(variable.get());
auto v2 = reinterpret_cast<uintptr_t>(equivalentVariable.get());
if (v2 < v1) {
std::swap(v1, v2);
}
unite(v1, v2);
}
}
for (size_t i = 0; i < component->componentCount(); ++i) {
buildEquivalentVariablesCache(component->component(i));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer my code not to be compact, i.e.

    for (size_t i = 0; i < component->variableCount(); ++i) {
        auto variable = component->variable(i);

        for (size_t j = 0; j < variable->equivalentVariableCount(); ++j) {
            auto equivalentVariable = variable->equivalentVariable(j);
            auto v1 = reinterpret_cast<uintptr_t>(variable.get());
            auto v2 = reinterpret_cast<uintptr_t>(equivalentVariable.get());

            if (v2 < v1) {
                std::swap(v1, v2);
            }

            unite(v1, v2);
        }
    }

    for (size_t i = 0; i < component->componentCount(); ++i) {
        buildEquivalentVariablesCache(component->component(i));
    }

Comment thread src/analysermodel.cpp
// means that we can safely cache the result of a call to that utility. In
// turn, this means that we can speed up any feature (e.g., code generation)
// that also relies on that utility.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put that empty line back.

Comment thread src/analysermodel_p.h
std::vector<AnalyserVariablePtr> mComputedConstants;
std::vector<AnalyserVariablePtr> mAlgebraicVariables;
std::vector<AnalyserVariablePtr> mExternalVariables;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put that empty line back. Different group of variables.

Comment thread src/analysermodel.cpp
mPimpl->mCachedEquivalentVariables.emplace(key, res);

return res;
return mPimpl->find(v1) == mPimpl->find(v2);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the API, so we probably want some safeguards. So, maybe have the following:

bool AnalyserModel::areEquivalentVariables(const VariablePtr &variable1,
                                           const VariablePtr &variable2)
{
    // This is a cached version of the areEquivalentVariables() utility. Indeed,
    // an AnalyserModel object refers to a static version of a model, which
    // means that we can safely cache the result of a call to that utility. In
    // turn, this means that we can speed up any feature (e.g., code generation)
    // that also relies on that utility.

    if (variable1 == variable2) {
        return true;
    }

    if ((variable1 == nullptr) || (variable2 == nullptr)) {
        return false;
    }

    const auto v1 = reinterpret_cast<uintptr_t>(variable1.get());
    const auto v2 = reinterpret_cast<uintptr_t>(variable2.get());

    return mPimpl->find(v1) == mPimpl->find(v2);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could have those tests here and then have the "meat" of the method in the private implementation, which could be called directly by the analyser to avoid those safeguards.

Comment thread src/analysermodel_p.h
Comment on lines -49 to +70
struct VariableKeyPair
{
uintptr_t first;
uintptr_t second;
std::unordered_map<uintptr_t, uintptr_t> mEquivalentVariableCache;

bool operator==(const VariableKeyPair &other) const
{
return (first == other.first) & (second == other.second);
uintptr_t find(uintptr_t x)
{
auto it = mEquivalentVariableCache.find(x);
if (it == mEquivalentVariableCache.end()) {
mEquivalentVariableCache[x] = x;
return x;
}
};
if (it->second != x) {
it->second = find(it->second);
}
return it->second;
}

struct VariableKeyPairHash
void unite(uintptr_t x, uintptr_t y)
{
size_t operator()(const VariableKeyPair &pair) const
{
// A simple and portable hash function for a pair of pointers.

size_t hash = pair.first;

hash ^= pair.second + 0x9e3779b9 + (hash << 6) + (hash >> 2);

return hash;
const uintptr_t &rootX = find(x);
const uintptr_t &rootY = find(y);
if (rootX != rootY) {
mEquivalentVariableCache[rootY] = rootX;
}
};

std::unordered_map<VariableKeyPair, bool, VariableKeyPairHash> mCachedEquivalentVariables;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put some empty lines here and there:

    std::unordered_map<uintptr_t, uintptr_t> mEquivalentVariableCache;

    uintptr_t find(uintptr_t x)
    {
        auto it = mEquivalentVariableCache.find(x);

        if (it == mEquivalentVariableCache.end()) {
            mEquivalentVariableCache[x] = x;

            return x;
        }

        if (it->second != x) {
            it->second = find(it->second);
        }

        return it->second;
    }

    void unite(uintptr_t x, uintptr_t y)
    {
        const auto rootX = find(x);
        const auto rootY = find(y);

        if (rootX != rootY) {
            mEquivalentVariableCache[rootY] = rootX;
        }
    }

Comment thread src/analysermodel_p.h
Comment on lines +65 to +66
const uintptr_t &rootX = find(x);
const uintptr_t &rootY = find(y);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be replaced with:

        const auto rootX = find(x);
        const auto rootY = find(y);

Comment thread src/analysermodel_p.h
Comment on lines +101 to +102
void buildEquivalentVariablesCache();
void buildEquivalentVariablesCache(const ComponentPtr &component);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would swap them, i.e. start with the "specialised" version and finish with the general version (same in the implementation).

Comment thread src/analysermodel.cpp
Comment on lines 527 to 528
auto v1 = reinterpret_cast<uintptr_t>(variable1.get());
auto v2 = reinterpret_cast<uintptr_t>(variable2.get());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can actually be replaced with:

    const auto v1 = reinterpret_cast<uintptr_t>(variable1.get());
    const auto v2 = reinterpret_cast<uintptr_t>(variable2.get());

Comment thread src/analysermodel_p.h
bool operator==(const VariableKeyPair &other) const
{
return (first == other.first) & (second == other.second);
uintptr_t find(uintptr_t x)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to use a different name? Maybe findVariableAddres()?

Comment thread src/analysermodel_p.h
}

struct VariableKeyPairHash
void unite(uintptr_t x, uintptr_t y)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to use a different name? Maybe uniteVariableAddreses()?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants