Releases: xflops/flame
v0.6.0
Highlights
- Added the Python Runner API for packaging local Python projects and exposing functions, classes, or object instances as Flame services.
- Added `flmrun`, a built-in Runner template application used by Runner-managed services.
- Fixed Runner package downloads for multi-object-cache deployments by preserving the cache endpoint returned from package upload.
- Reworked the Python SDK around synchronous client APIs, object-cache helpers, service sessions, and the `flamepy.runner` module.
- Simplified the Rust SDK with top-level connection/session helpers, typed task messages, host-service helpers, object-cache helpers, and service macros.
- Added a standalone object cache with versioned object references, pluggable storage, upload/download support, incremental object fetch, and fast paths for tabular/numpy/Arrow data.
- Added `flmadm` installation profiles and support for installing Flame components, SDKs, examples, and multiple Python SDK versions.
- Added `flmctl deploy` and object helper commands for application deployment and object-cache workflows.
- Added a Helm chart and Kind-based Kubernetes E2E workflow for static Flame installs on Kubernetes.
- Updated Helm/Kubernetes defaults for filesystem session-manager storage and the `drf/gang` policy set.
- Fixed Flame CLI version metadata so binaries report their package version.
- Added scheduler policy work for priority scheduling, GPU-aware DRF, dynamic policy configuration, gang/batch flows, and resource requirements.
- Added session/runtime reliability improvements including TLS, executor recovery, session bind failure recovery, task watching, node/executor persistence, and notifier-based wakeups.
- Expanded system, E2E, Runner, cache, Python SDK, Rust SDK, and storage test coverage.
Upgrade Notes
- Python SDK packaging is now versioned as `0.6.0`, requires Python 3.9 or newer, and advertises Python 3.9 through 3.12.
- Runner imports should use `flamepy.runner`; older `flamepy.rl` naming was replaced.
- `flmadm install` now requires an explicit profile flag such as `--all`, `--control-plane`, `--worker`, `--cache`, or `--client`.
- Cluster policy configuration now supports multiple policies through the `policies` list.
- The Helm chart defaults now configure session-manager storage as `fs:///var/lib/flame/session` and enable only the `drf` and `gang` policies.
- Object cache references are versioned. Clients can use `get_object`, `update_object`, `patch_object`, `upload_object`, and `download_object` instead of passing raw object payloads through sessions.
- `flmexec` supports explicit runtime selection for scripts, including Python runtime selection through `FLAME_PYTHON_VERSION`. Its default Python runtime policy is owned by `flmexec` rather than the Rust SDK.
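The Python requirements and the `flamepy.runner` import change above can be checked before upgrading. The following is a minimal preflight sketch that uses only the standard library; the helper names (`meets_minimum`, `is_advertised`, `module_available`) are hypothetical and are not part of `flamepy`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)       # minimum stated in the upgrade notes
ADVERTISED_MAX = (3, 12)  # highest version the notes say is advertised


def meets_minimum(version=None):
    """Return True if the given (or running) interpreter is Python 3.9+."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor >= MIN_PYTHON


def is_advertised(version=None):
    """Return True if the version falls in the advertised 3.9-3.12 range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= ADVERTISED_MAX


def module_available(name):
    """Return True if `name` can be located.

    find_spec imports parent packages but not the named module itself,
    and raises ModuleNotFoundError when a parent package is missing.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    print("Python >= 3.9:", meets_minimum())
    print("Within advertised range:", is_advertised())
    print("flamepy.runner importable:", module_available("flamepy.runner"))
```

A version newer than 3.12 passes `meets_minimum` but not `is_advertised`; the notes only state which versions are advertised, so treat that case as untested rather than unsupported.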
Python SDK And Runner
- Added `flamepy.Runner` and `flamepy.runner` to package local projects and run Python execution objects remotely (#298, #341, #342).
- Added Runner helpers for waiting, selecting, resolving object references, and fetching result objects (#300, #363).
- Added explicit Runner defaults for stateless functions/classes and stateful object instances (#474).
- Fixed Runner packaging for ad hoc scripts and dependency metadata generation (#471).
- Fixed Runner package URL generation to use the returned object-cache endpoint, so executor-manager downloads packages from the cache pod that stored them in multi-cache deployments (#482).
- Added support for Runner dependencies and Python-version selection in Runner/flmexec workflows (#468, #469, #473).
- Added `flmrun` as a built-in application template for Runner services (#285).
- Added Python service-session helpers and renamed the FlamePy agent API to service sessions (#454).
- Added custom session IDs and `open_session` enhancements for Python clients and Runner services (#351, #353, #354).
- Added `Session.list_tasks` and task watching improvements for Python clients (#359, #371).
- Aligned `flamepy` package metadata for the 0.6 release, including `__version__`, Python 3.9+ tooling targets, and Python 3.12 classifier support (#477).
Rust SDK
- Simplified the `flame-rs` API with direct helpers for connecting, creating/opening sessions, running typed tasks, and writing services (#456).
- Added typed service macros and typed task payload support through `FlameMessage` (#456).
- Added Rust object-cache helpers for putting, getting, updating, patching, uploading, downloading, and deleting objects (#427, #430, #456).
- Kept `flmexec` Python runtime default policy local to `flmexec` and removed that default from the public Rust SDK surface (#475).
- Fixed Rust SDK logger initialization when `RUST_LOG` is unset (#343).
Object Cache And Storage
- Added the standalone `flame-object-cache` service and object-cache client helpers (#244, #321).
- Added common-data/object-cache integration and object references for shared session state (#258, #259, #269, #296).
- Added LRU eviction and per-application cache behavior (#367, #358).
- Added pluggable cache storage backends, object versioning, and a standalone cache binary (#419, #427).
- Added cache upload/download support with multi-scheme downloaders (#430).
- Added incremental object get, native tabular cache path, and fast-path serialization for numpy/Arrow data (#444, #446, #464).
- Added filesystem, HTTP, and none storage engines for session manager and package/cache workflows (#339, #344, #377, #395).
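To illustrate what a versioned object reference buys a client, here is a toy in-memory model. It is a sketch under stated assumptions, not Flame's cache implementation: `ObjectRef`, `ToyObjectCache`, and their methods are hypothetical names, and the real client helpers may behave differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRef:
    """Hypothetical reference: a key plus the version it was issued for."""
    key: str
    version: int


class ToyObjectCache:
    """In-memory stand-in that bumps a version on every write."""

    def __init__(self):
        self._objects = {}  # key -> (version, payload)

    def put(self, key, payload):
        # First write gets version 1; later writes increment it.
        version = self._objects.get(key, (0, None))[0] + 1
        self._objects[key] = (version, payload)
        return ObjectRef(key, version)

    def get(self, ref):
        # Return the current reference together with the current payload.
        version, payload = self._objects[ref.key]
        return ObjectRef(ref.key, version), payload

    def is_stale(self, ref):
        # A reference is stale once the stored version has moved on.
        return self._objects[ref.key][0] != ref.version


if __name__ == "__main__":
    cache = ToyObjectCache()
    first = cache.put("weights", b"v1-bytes")
    second = cache.put("weights", b"v2-bytes")
    print(cache.is_stale(first), cache.is_stale(second))
```

With a version carried in the reference, a holder of an old reference can detect that the object changed without comparing payloads, which is the property that makes passing references through sessions cheaper than passing raw payloads.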
Scheduling And Runtime Management
- Added batch-session support and gang-related E2E coverage (#401, #407).
- Added priority scheduling and dynamic scheduler policy configuration (#428, #432).
- Added GPU-aware DRF scheduling with resource requirements (#434).
- Added resource requirement support in the priority plugin (#442).
- Added configurable scheduling interval and moved executor limits into `limits` (#373, #396).
- Added application URL support, installer metadata, and application cache-key improvements (#287, #421, #425).
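For readers unfamiliar with DRF (Dominant Resource Fairness), the sketch below shows the general textbook algorithm: repeatedly grant one task to the user whose dominant share is lowest. It is not Flame's scheduler code; the function names, the dictionary-based inputs, and the resource keys are assumptions made for illustration. A GPU would simply be one more resource key alongside CPU and memory.

```python
from fractions import Fraction


def dominant_share(allocated, capacity):
    """Largest fraction of any single resource held by one user."""
    return max(Fraction(allocated[r], capacity[r]) for r in capacity)


def drf_allocate(capacity, demands):
    """Grant tasks one at a time to the user with the lowest dominant share.

    capacity: {resource: total}, all totals assumed positive.
    demands:  {user: {resource: per-task need}}.
    Returns   {user: number of tasks granted}.
    """
    for user, need in demands.items():
        if not any(need.get(r, 0) > 0 for r in capacity):
            raise ValueError(f"demand for {user!r} requests no resources")

    used = {r: 0 for r in capacity}
    allocated = {u: {r: 0 for r in capacity} for u in demands}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    while active:
        # sorted() makes tie-breaking deterministic (alphabetical).
        user = min(sorted(active),
                   key=lambda u: dominant_share(allocated[u], capacity))
        need = demands[user]
        if any(used[r] + need.get(r, 0) > capacity[r] for r in capacity):
            active.discard(user)  # this user's next task no longer fits
            continue
        for r in capacity:
            used[r] += need.get(r, 0)
            allocated[user][r] += need.get(r, 0)
        tasks[user] += 1
    return tasks


if __name__ == "__main__":
    capacity = {"cpu": 9, "mem_gb": 18}
    demands = {"A": {"cpu": 1, "mem_gb": 4}, "B": {"cpu": 3, "mem_gb": 1}}
    print(drf_allocate(capacity, demands))  # {'A': 3, 'B': 2}
```

In this example user A is memory-dominant and user B is CPU-dominant, and both end with a dominant share of 2/3, which is the equalizing behavior DRF is designed to produce.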
CLI, Installation, And Local Development
- Added `flmadm` for installation, profile-based component selection, systemd integration, user-local installs, examples, and uninstall flows (#334, #338, #421, #468).
- Added `flmctl deploy` and object helper commands (#459).
- Added JSON output for session commands and improved CLI display behavior (#215, #439).
- Changed Flame CLI metadata to derive `--version` output from package versions and use `XFLOPS <support@xflops.io>` as the author (#488).
- Added Podman support and local development helpers (#212).
- Added Docker, compose, and local system-test workflows for multi-component clusters (#345, #347, #466, #470, #472).
Kubernetes And Helm
- Added `charts/flame`, a static Helm chart for installing Flame's session manager, executor manager, object cache, client configuration, ServiceAccount, services, persistent volum...
v0.6.0-rc2
v0.6.0-rc1
Highlights
- Added the Python Runner API for packaging local Python projects and exposing functions, classes, or object instances as Flame services.
- Added `flmrun`, a built-in Runner template application used by Runner-managed services.
- Reworked the Python SDK around synchronous client APIs, object-cache helpers, service sessions, and the `flamepy.runner` module.
- Simplified the Rust SDK with top-level connection/session helpers, typed task messages, host-service helpers, object-cache helpers, and service macros.
- Added a standalone object cache with versioned object references, pluggable storage, upload/download support, incremental object fetch, and fast paths for tabular/numpy/Arrow data.
- Added `flmadm` installation profiles and support for installing Flame components, SDKs, examples, and multiple Python SDK versions.
- Added `flmctl deploy` and object helper commands for application deployment and object-cache workflows.
- Added a Helm chart and Kind-based Kubernetes E2E workflow for static Flame installs on Kubernetes.
- Added scheduler policy work for priority scheduling, GPU-aware DRF, dynamic policy configuration, gang/batch flows, and resource requirements.
- Added session/runtime reliability improvements including TLS, executor recovery, session bind failure recovery, task watching, node/executor persistence, and notifier-based wakeups.
- Expanded system, E2E, Runner, cache, Python SDK, Rust SDK, and storage test coverage.
Upgrade Notes
- Python SDK packaging is now versioned as `0.6.0`, requires Python 3.9 or newer, and advertises Python 3.9 through 3.12.
- Runner imports should use `flamepy.runner`; older `flamepy.rl` naming was replaced.
- `flmadm install` now requires an explicit profile flag such as `--all`, `--control-plane`, `--worker`, `--cache`, or `--client`.
- Cluster policy configuration now supports multiple policies through the `policies` list.
- Object cache references are versioned. Clients can use `get_object`, `update_object`, `patch_object`, `upload_object`, and `download_object` instead of passing raw object payloads through sessions.
- `flmexec` supports explicit runtime selection for scripts, including Python runtime selection through `FLAME_PYTHON_VERSION`. Its default Python runtime policy is owned by `flmexec` rather than the Rust SDK.
Python SDK And Runner
- Added `flamepy.Runner` and `flamepy.runner` to package local projects and run Python execution objects remotely (#298, #341, #342).
- Added Runner helpers for waiting, selecting, resolving object references, and fetching result objects (#300, #363).
- Added explicit Runner defaults for stateless functions/classes and stateful object instances (#474).
- Fixed Runner packaging for ad hoc scripts and dependency metadata generation (#471).
- Added support for Runner dependencies and Python-version selection in Runner/flmexec workflows (#468, #469, #473).
- Added `flmrun` as a built-in application template for Runner services (#285).
- Added Python service-session helpers and renamed the FlamePy agent API to service sessions (#454).
- Added custom session IDs and `open_session` enhancements for Python clients and Runner services (#351, #353, #354).
- Added `Session.list_tasks` and task watching improvements for Python clients (#359, #371).
- Aligned `flamepy` package metadata for the 0.6 release, including `__version__`, Python 3.9+ tooling targets, and Python 3.12 classifier support (#477).
Rust SDK
- Simplified the `flame-rs` API with direct helpers for connecting, creating/opening sessions, running typed tasks, and writing services (#456).
- Added typed service macros and typed task payload support through `FlameMessage` (#456).
- Added Rust object-cache helpers for putting, getting, updating, patching, uploading, downloading, and deleting objects (#427, #430, #456).
- Kept `flmexec` Python runtime default policy local to `flmexec` and removed that default from the public Rust SDK surface (#475).
- Fixed Rust SDK logger initialization when `RUST_LOG` is unset (#343).
Object Cache And Storage
- Added the standalone `flame-object-cache` service and object-cache client helpers (#244, #321).
- Added common-data/object-cache integration and object references for shared session state (#258, #259, #269, #296).
- Added LRU eviction and per-application cache behavior (#367, #358).
- Added pluggable cache storage backends, object versioning, and a standalone cache binary (#419, #427).
- Added cache upload/download support with multi-scheme downloaders (#430).
- Added incremental object get, native tabular cache path, and fast-path serialization for numpy/Arrow data (#444, #446, #464).
- Added filesystem, HTTP, and none storage engines for session manager and package/cache workflows (#339, #344, #377, #395).
Scheduling And Runtime Management
- Added batch-session support and gang-related E2E coverage (#401, #407).
- Added priority scheduling and dynamic scheduler policy configuration (#428, #432).
- Added GPU-aware DRF scheduling with resource requirements (#434).
- Added resource requirement support in the priority plugin (#442).
- Added configurable scheduling interval and moved executor limits into `limits` (#373, #396).
- Added application URL support, installer metadata, and application cache-key improvements (#287, #421, #425).
CLI, Installation, And Local Development
- Added `flmadm` for installation, profile-based component selection, systemd integration, user-local installs, examples, and uninstall flows (#334, #338, #421, #468).
- Added `flmctl deploy` and object helper commands (#459).
- Added JSON output for session commands and improved CLI display behavior (#215, #439).
- Added Podman support and local development helpers (#212).
- Added Docker, compose, and local system-test workflows for multi-component clusters (#345, #347, #466, #470, #472).
Kubernetes And Helm
- Added `charts/flame`, a static Helm chart for installing Flame's session manager, executor manager, object cache, client configuration, ServiceAccount, services, persistent volumes, and Helm test resources (#479).
- Added chart values and schema coverage for images, service ports, storage, runtime volumes, TLS Secret mounting, client config, component enablement, and static object-cache replica counts (#479).
- Added a Kind-based Kubernetes E2E workflow that builds Flame images, installs the Helm chart, runs Helm tests, and verifies `flmctl`, `flmping`, and the Python Pi Runner example from an in-cluster console pod (#479).
Reliability, Recovery, And Observability
v0.5.0
What's Changed
- Bump up the versions. by @k82cn in #134
- New flmexec arch. by @k82cn in #135
- Run py script by flmexec by @k82cn in #136
- chore: add sdk go for mcp. by @k82cn in #137
- fix: update readme for elastic workload by @k82cn in #138
- fix: retry FEM in CI. by @k82cn in #139
- chore: add UT about tasks for Go SDK. by @k82cn in #140
- chore: add flame mcp for agent. by @k82cn in #143
- chore: add UT to run multi tasks in one session. by @k82cn in #144
- chore: replace TCP with Unix socket for shim/svc. by @k82cn in #147
- chore: add python sdk and Agent example. by @k82cn in #146
- chore: create session in py. by @k82cn in #148
- chore: add openai agent example. by @k82cn in #149
- chore: add AI Agent blog. by @k82cn in #150
- chore: run LLM code via Flame. by @k82cn in #151
- docs: add run LLM script blog. by @k82cn in #153
- fix: rm stdio/shell shim. by @k82cn in #154
- chore: add delay_release & max_instances. by @k82cn in #156
- chore: add description & labels to application. by @k82cn in #157
- chore: add executor manager. by @k82cn in #159
- fix: update flame arch. by @k82cn in #160
- fix: update flame arch. by @k82cn in #161
- chore: add input, output and common_data schema to application. by @k82cn in #163
- fix: validate schema of input/output/common_data. by @k82cn in #164
- Update to flamepy by @k82cn in #165
- Add py e2e. by @k82cn in #166
- fix: update fsi socket path. by @k82cn in #167
- chore: add FlameInstance. by @k82cn in #169
- fix: fix empty schema error. by @k82cn in #170
- chore: add langchain example. by @k82cn in #171
- chore: add spider example. by @k82cn in #172
- chore: rename shims. by @k82cn in #173
- chore: add e2e framework. by @k82cn in #174
- chore: add test application e2e. by @k82cn in #175
- fix: fix unregister app issue. by @k82cn in #176
- fix: if session closed, did not launch task. by @k82cn in #177
- fix: remove Go deps. by @k82cn in #178
- chore: add e2e app. by @k82cn in #179
- chore: add multi-ssn for langchain. by @k82cn in #180
- chore: add cri-rs for cri shim. by @k82cn in #181
- chore: add update application. by @k82cn in #182
- fix: provide more error message. by @k82cn in #183
- chore: replace log/env_logger with tracing. by @k82cn in #184
- chore: return error/exception to client. by @k82cn in #185
- chore: add events for ssn/tasks. by @k82cn in #186
- fix: enhance e2e ci. by @k82cn in #187
- fix: enhance multi-task e2e. by @k82cn in #188
- fix: enable benchmark in e2e. by @k82cn in #189
- fix: update flame version. by @k82cn in #190
- fix: update version to 0.5.0 by @k82cn in #191
- fix: update quick start in readme. by @k82cn in #192
- chore: add crawler example. by @k82cn in #193
- chore: enhance flmping for debug. by @k82cn in #194
- Fix action errors by @k82cn in #195
- docs: minor grammar fix by @Monokaix in #196
- fix: killpg for instances. by @k82cn in #197
- fix: disable shuffle/backfill actions. by @k82cn in #198
- fix: update deserved by total slots. by @k82cn in #199
- chore: add SRA example. by @k82cn in #200
- chore: add list executor to flmctl. by @k82cn in #201
- Shuffle action by @k82cn in #202
- chore: update sqlite version. by @k82cn in #203
- chore: add dedicated thread in fsm. by @k82cn in #204
- fix: load applications. by @k82cn in #205
- fix: enhance ptr struct. by @k82cn in #206
- chore: add version for app/ssn/task. by @k82cn in #207
- fix: avoid over preemption. by @k82cn in #208
New Contributors
Full Changelog: v0.4.0...v0.5.0
v0.4.0
v0.3.0
- #16 Support HA for flame (@k82cn)
- #70 Add example about "multiply matrices" (@k82cn)
- #73 Add common data into SessionContext (@k82cn)
- #72 Add WASM shim (@k82cn)
- #63 Replace supervisord with skaffold/minikube (@k82cn)
- #52 Add task_id & session_id in stdio_shim (@k82cn)
- #20 Add python client (@k82cn)
- #57 Update flmping with flame rust client (@k82cn)
v0.2.0
This is the initial version of Flame; it provides basic features through the following issues/MRs.
- #33: Add more integration test (@k82cn)
- #32: The output of flmctl list should be ordered (@k82cn)
- #34: Create multiple sessions (@k82cn)
- #30: Remove rpc model dependency from flame-client (@k82cn)
- #23: Add watch_task gRPC (@k82cn)
- #9: Add CI for Flame (@k82cn)
- #22: Support resource share between multiple sessions. (@k82cn)
- #12: Add Monte-Carlo Pi example (@k82cn)
- #14: Filter executors based on session spec in scheduler (@k82cn)
- #7: Fix 'cargo clippy' complaints (@k82cn)
- #21: Replace String by Bytes for TaskOutput/TaskInput (@k82cn)
- #10: Move all APIs into common::apis mod. (@k82cn)
- #11: Support TaskOutput (@k82cn)
- #13: Build flame-client for the demo (@k82cn)
- #8: Add stdio_shim (@k82cn)