feat: add get_fcidump to output #253

Open
haneug wants to merge 4 commits into faccts:main from haneug:feature/output-fcidump-parser

@haneug haneug commented Jun 19, 2026

Contributor

Closes Issues

Closes None

Description

  • Added a small helper class that parses the FCIDUMP file and returns its key quantities (the one- and two-electron integrals) as NumPy arrays. `output.get_fcidump` populates and returns this object.
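To illustrate what such a helper has to do, here is a minimal sketch of reading a FCIDUMP text into NumPy arrays. It is not the class added in this PR: `FcidumpSketch` and `from_text` are hypothetical names, and the sketch assumes real orbitals, the common `value i j k l` record layout with 1-based indices, and 8-fold permutational symmetry of the two-electron integrals.

```python
from dataclasses import dataclass
import re

import numpy as np


@dataclass
class FcidumpSketch:
    """Hypothetical stand-in, not the class added in this PR."""

    norb: int
    ecore: float
    h1: np.ndarray  # one-electron integrals, shape (norb, norb)
    h2: np.ndarray  # two-electron integrals (ij|kl), shape (norb,) * 4

    @classmethod
    def from_text(cls, text: str) -> "FcidumpSketch":
        lines = text.splitlines()
        # The namelist header is assumed to end with a line "&END" or "/".
        end = next(
            (i for i, ln in enumerate(lines) if ln.strip().upper() in ("&END", "/")),
            None,
        )
        if end is None:
            raise ValueError("header terminator not found")
        match = re.search(r"\bNORB\s*=\s*(\d+)", " ".join(lines[:end]), re.IGNORECASE)
        if match is None:
            raise ValueError("NORB not found in header")
        norb = int(match.group(1))

        h1 = np.zeros((norb, norb))
        h2 = np.zeros((norb, norb, norb, norb))
        ecore = 0.0
        for ln in lines[end + 1:]:
            fields = ln.split()
            if len(fields) != 5:
                continue  # skip blank or unexpected lines
            # Accept Fortran-style exponents such as 1.0D-05.
            value = float(fields[0].upper().replace("D", "E"))
            # Convert to 0-based indices; -1 then marks an unused index.
            i, j, k, l = (int(f) - 1 for f in fields[1:])
            if i < 0:
                ecore = value  # record "0 0 0 0": core energy
            elif j < 0:
                continue  # record "i 0 0 0": orbital energy, ignored here
            elif k < 0:
                h1[i, j] = h1[j, i] = value
            else:
                # Fill all eight symmetry-equivalent positions.
                for a, b, c, d in (
                    (i, j, k, l), (j, i, k, l), (i, j, l, k), (j, i, l, k),
                    (k, l, i, j), (l, k, i, j), (k, l, j, i), (l, k, j, i),
                ):
                    h2[a, b, c, d] = value
        return cls(norb, ecore, h1, h2)


EXAMPLE = """\
 &FCI NORB=2,NELEC=2,MS2=0,
  ORBSYM=1,1,
  ISYM=1,
 &END
  0.50  1  1  1  1
  0.25  2  1  2  1
 -1.20  1  1  0  0
 -0.40  2  2  0  0
  0.70  0  0  0  0
"""

data = FcidumpSketch.from_text(EXAMPLE)
```

Storing the full four-index tensor is the simplest layout for small active spaces; a packed layout would save memory but is outside the scope of this sketch.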

Release Notes

Added

  • Added `get_fcidump` for easy access to FCIDUMP file properties.

@haneug haneug added this to the 3.0.0 milestone Jun 19, 2026
@haneug haneug self-assigned this Jun 19, 2026
@haneug haneug added the enhancement (New feature or request) and side output (Concerning parsing ORCA output) labels Jun 19, 2026
@haneug haneug marked this pull request as ready for review June 19, 2026 06:53
@haneug haneug requested a review from a team as a code owner June 19, 2026 06:53
@haneug haneug force-pushed the feature/output-fcidump-parser branch from a2b39ef to 17a2832 Compare June 19, 2026 07:38
Comment thread src/opi/output/core.py
"""
Parse the fcidump file generated by ORCA and return its data in the `Fcidump` data class.
The fcidump file has to be generated by the ORCA job and cannot be generated on-the-fly after the calculation.
ORCA can generate fcidump files via the `dumpactints true` flag in the `%output` block.

Contributor

Suggested change
- ORCA can generate fcidump files via the `dumpactints true` flag in the `%output` block.
+ To generate FCIDUMP files, set the following options:
+ ```
+ %output
+ dumpactints true
+ end
+ ```

IMO this reads more concisely.

Comment thread src/opi/output/core.py

def get_fcidump(self) -> Fcidump | None:
    """
    Parse the fcidump file generated by ORCA and return its data in the `Fcidump` data class.

Contributor

Suggested change
- Parse the fcidump file generated by ORCA and return its data in the `Fcidump` data class.
+ Parse the FCIDUMP file generated by ORCA and return its data in the `Fcidump` data class.

Comment thread src/opi/output/core.py
def get_fcidump(self) -> Fcidump | None:
    """
    Parse the fcidump file generated by ORCA and return its data in the `Fcidump` data class.
    The fcidump file has to be generated by the ORCA job and cannot be generated on-the-fly after the calculation.

Contributor

Suggested change
- The fcidump file has to be generated by the ORCA job and cannot be generated on-the-fly after the calculation.
+ The FCIDUMP file has to be generated by the ORCA job and cannot be generated on-the-fly after the calculation.

Comment thread src/opi/output/core.py
Returns
-------
fcidump_data: Fcidump | None
    The parsed fcidump data or None if the file is not present or could not be parsed.

Contributor

Suggested change
- The parsed fcidump data or None if the file is not present or could not be parsed.
+ The parsed FCIDUMP data or None if the file is not present or could not be parsed.

@@ -0,0 +1,161 @@
"""Parse a potential FCIDUMP file"""

Contributor

A module that contains one primary class should also be named after that class, i.e. fcidump.py.

        return tensor

    @classmethod
    def parse_fcidump(cls, path: Path | str) -> "Fcidump":

Contributor

  1. IMO from_file() is a more accessible name than parse_fcidump().
  2. Is the FCIDUMP file format documented anywhere? If so, a link could be added to the docstring.

    @classmethod
    def _get_int(cls, key: str, header: str) -> int:
        """Return the integer value of the given key."""
        m = re.search(rf"{key}\s*=\s*(\d+)", header, re.IGNORECASE)

Contributor

What about negative numbers?
If these lines always look as follows:

key    =     1234   [END OF LINE]

you could use `.partition("=")`.
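A minimal sketch of the `.partition("=")` idea, under the assumption stated in the comment that every header line holds exactly one `key = value` pair. The function name `get_int_by_partition` is hypothetical. FCIDUMP headers commonly place several pairs on one line (for example `&FCI NORB=4,NELEC=4,`), in which case this approach would not apply as written.

```python
def get_int_by_partition(key: str, header: str) -> int:
    """Return the integer assigned to `key`, assuming one `key = value` pair per line."""
    for line in header.splitlines():
        name, sep, value = line.partition("=")
        # `sep` is empty when the line contains no "=" at all.
        if sep and name.strip().upper() == key.upper():
            # int() tolerates surrounding whitespace and a leading +/- sign;
            # a trailing comma is stripped defensively.
            return int(value.strip().rstrip(","))
    raise ValueError(f"{key} not found in header")


header = "NORB   =     4\nMS2 = -2\nISYM = +1,\n"
norb = get_int_by_partition("NORB", header)
ms2 = get_int_by_partition("ms2", header)
isym = get_int_by_partition("ISYM", header)
```

Because the conversion is left to `int()`, signed values are handled without any extra pattern logic.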

    @classmethod
    def _get_int_list(cls, key: str, header: str) -> list[int]:
        """Return a list of integers corresponding to the given key."""
        m = re.search(rf"{key}\s*=\s*([\d,\s]+)", header, re.IGNORECASE)

Contributor

Suggested change
- m = re.search(rf"{key}\s*=\s*([\d,\s]+)", header, re.IGNORECASE)
+ m = re.search(rf"{key}\s*=\s*(\d+(\s*,\s*\d+)+)", header, re.IGNORECASE)

What about a more precise regex?
This still does not account for a leading plus/minus sign.
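One possible pattern that also accepts a leading sign is sketched below. The standalone function `get_int_list` is a hypothetical stand-in for the private helper, and the `\b` anchor and `re.escape` call are additions not present in the PR. Note that the suggested pattern above, as written, requires at least one comma and so would not match a single-entry list such as `ISYM=1`; the sketch uses `*` instead of `+` for the repeated group.

```python
import re


def get_int_list(key: str, header: str) -> list[int]:
    """Return the comma-separated integers assigned to `key` (signs allowed)."""
    pattern = rf"\b{re.escape(key)}\s*=\s*([+-]?\d+(?:\s*,\s*[+-]?\d+)*)"
    m = re.search(pattern, header, re.IGNORECASE)
    if m is None:
        raise ValueError(f"{key} not found in header")
    # int() strips the whitespace (including newlines) left around each token.
    return [int(token) for token in m.group(1).split(",")]


header = " &FCI NORB=3,NELEC=2,\n  ORBSYM=1,-1, +2,\n  ISYM=1,\n &END\n"
orbsym = get_int_list("ORBSYM", header)
isym = get_int_list("isym", header)
```

The match stops at the trailing comma before `ISYM`, since a comma only extends the list when another integer follows it.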

@@ -0,0 +1,79 @@
import textwrap

Contributor

Please add unit tests for `_get_int_list()` and `_get_int()` specifically.


@pytest.mark.unit
def test_parse_fcidump_header(tmp_path: Path) -> None:
    fcidump_text = textwrap.dedent("""\

Contributor

I would actually not remove the indentation.
If a file format does not rely on indentation, then our file parser shouldn't either. The current implementation already handles this (good implementation 👍), so I would actively test that.
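One way to test indentation independence actively is to feed the same content with several indentation prefixes and compare the results. In this sketch `parse_header_ints` is a hypothetical stand-in for the real parser; it records only the first integer after each key, which is enough to show the test structure.

```python
import re
import textwrap

FCIDUMP_TEXT = textwrap.dedent("""\
    &FCI NORB=2,NELEC=2,MS2=0,
    ORBSYM=1,1,
    ISYM=1,
    &END
    0.5 1 1 1 1
    """)


def parse_header_ints(text: str) -> dict[str, int]:
    """Hypothetical stand-in: map each header key to the first integer after it."""
    header = text.upper().split("&END", 1)[0]
    pairs = re.findall(r"\b([A-Z0-9]+)\s*=\s*([+-]?\d+)", header)
    return {key: int(value) for key, value in pairs}


def test_indentation_is_ignored() -> None:
    reference = parse_header_ints(FCIDUMP_TEXT)
    # The same content, indented with spaces or a tab, must parse identically.
    for prefix in ("", " ", "    ", "\t"):
        indented = textwrap.indent(FCIDUMP_TEXT, prefix)
        assert parse_header_ints(indented) == reference
```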

Labels

enhancement (New feature or request)
side output (Concerning parsing ORCA output)


2 participants