Performance impact due to No Fact caching #658

Open
pemsith opened this issue Jan 13, 2025 · 1 comment

pemsith commented Jan 13, 2025

Description

The repository contains Ansible playbooks that gather and use facts, but its ansible.cfg files do not configure a persistent cache plugin. By default, Ansible uses the memory cache plugin, which retains facts only for the current run and does not persist them across runs. Without a persistent fact cache, facts are gathered again on every run, which slows execution and increases memory consumption during long or complex playbook executions. This hurts scalability and efficiency, especially with large inventories or repeated playbook runs.

Expected Behavior

The request is either to add the settings suggested below to the ansible.cfg files delivered in this repository, or to strongly recommend that users add them themselves.
A suitable cache plugin should be configured in ansible.cfg so that facts persist across Ansible runs. Plugins such as jsonfile, redis, or yaml can store facts persistently, which reduces repeated fact gathering, improves performance, and lowers memory consumption.

Expected example configuration in ansible.cfg:
[defaults]
fact_caching = jsonfile
fact_caching_connection = /path/to/fact_cache
fact_caching_timeout = 7200

Actual Behavior

Performance suffers because no persistent fact caching plugin is configured.

Environment

CentOS Stream 9

Collaborator

cschug commented Jan 13, 2025

Those settings alone won't have any effect, because DEFAULT_GATHERING defaults to implicit, which ignores the cache: facts are still gathered for every play unless the playbook sets gather_facts to false (in which case the existence of a cache doesn't matter either).

And changing DEFAULT_GATHERING to either explicit or smart opens a can of worms unless all your playbooks and roles are designed for it. Otherwise they might act on outdated facts, with potentially undesired results.

IMHO this is a performance optimization that someone can apply locally, in a controlled environment, by opting in with full awareness of the change. Changing the default behaviour of Ansible without notice sounds questionable to me. In my opinion, a default should never sacrifice robustness or expected behavior for execution time.

FWIW, if using those settings, I would prefer Ansible's FQCN naming schema, hence

fact_caching = ansible.builtin.jsonfile
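
For anyone who wants to opt in locally, a minimal sketch of how the points above could fit together in ansible.cfg is shown here. It assumes the `gathering` key in the `[defaults]` section (the ini form of DEFAULT_GATHERING) and reuses the placeholder path and timeout from the issue description. It is an illustration only, not a configuration shipped by this repository.

```ini
[defaults]
# Opt-in: "smart" skips gathering for hosts that already have valid cached facts.
# With the default "implicit", the cache settings below are ignored.
gathering = smart

# Persistent cache plugin, written with its fully qualified collection name.
fact_caching = ansible.builtin.jsonfile

# Directory where the jsonfile plugin stores one file per host (placeholder path).
fact_caching_connection = /path/to/fact_cache

# Cache lifetime in seconds (7200 = 2 hours); after this, facts are gathered again.
fact_caching_timeout = 7200
```

With a setup like this, playbooks and roles that rely on fresh facts may see values up to two hours old, which is the risk described above.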
