Automatically add created, last_changed and changed_by metadata for index templates, component templates and ingest pipelines #108754

Open
flash1293 opened this issue May 17, 2024 · 6 comments
Labels: :Data Management/ILM+SLM, >enhancement, Team:Data Management

Comments

@flash1293
Contributor

flash1293 commented May 17, 2024

Description

Related to #108469

To troubleshoot ingestion issues, it helps to know what changed around the time the problems started. Full version control of ES objects is an aspirational goal, but as low-hanging fruit, metadata recording when an ES object was created, when it was last changed, and by whom would deliver much of the value without a big investment.

Scope

For index templates, component templates and ingest pipelines, track created_at, modified_at and modified_by, and return them as part of the respective APIs used to retrieve these objects:

GET _ingest/pipeline/my-pipeline

{
  "my-pipeline": {
    "processors": [
      ...
    ],
    "_meta": {
      ...
    },
    "created_at": "2025-05-05T00:00:00",
    "modified_at": "2025-05-05T00:00:00",
    "modified_by": "user_xyz"
  }
}

GET _component_template/my-template

{
  "component_templates": [
    {
      "name": "my-template",
      "component_template": {
        "template": {
          ... 
        },
        "version": 3,
        "_meta": {
          ...
        },
        "created_at": "2025-05-05T00:00:00",
        "modified_at": "2025-05-05T00:00:00",
        "modified_by": "user_xyz"
      }
    }
  ]
}

GET _index_template/my-template

{
  "index_templates": [
    {
      "name": "my-template",
      "index_template": {
        ...,
        "created_at": "2025-05-05T00:00:00",
        "modified_at": "2025-05-05T00:00:00",
        "modified_by": "user_xyz"
      }
    }
  ]
}

These properties are read-only: they are set by the system and cannot be written by users.
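
A minimal sketch of that read-only behaviour, assuming a hypothetical server-side helper `stamp_metadata` (not an existing Elasticsearch function) that runs whenever one of these objects is stored. It discards any client-supplied values for the three fields, keeps `created_at` from the previously stored version, and refreshes `modified_at` and `modified_by`:

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical: the system-managed field names proposed in this issue.
SYSTEM_FIELDS = ("created_at", "modified_at", "modified_by")


def stamp_metadata(body: dict, existing: Optional[dict], user: str,
                   now: Optional[datetime] = None) -> dict:
    """Return a copy of `body` with the system-managed fields set by the server."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%S")
    # Drop anything the client tried to set for the system-managed fields.
    stored = {k: v for k, v in body.items() if k not in SYSTEM_FIELDS}
    # created_at survives updates; it is only set when the object is new.
    stored["created_at"] = existing["created_at"] if existing else timestamp
    stored["modified_at"] = timestamp
    stored["modified_by"] = user
    return stored
```

Whether a write that includes these fields should be rejected outright or silently ignored is left open here; the sketch ignores them.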

Considerations

Merge with _meta

The existing _meta object is user-controlled. An alternative approach would be to add this information to that object and overwrite whatever the user set manually. While this is cleaner in that it introduces no new properties on these objects, it has some downsides:

  • The information could be falsified, since _meta is user-controlled
  • It might interfere with custom user-defined tracking solutions (possibly a breaking change)
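
To make the second downside concrete, here is a sketch of the _meta alternative, assuming a hypothetical `merge_into_meta` helper and made-up user keys. A user who already tracks their own `modified_by` inside `_meta` would have it silently overwritten:

```python
def merge_into_meta(body: dict, system_values: dict) -> dict:
    """Hypothetical: write system-managed values into the user-controlled _meta."""
    merged = dict(body)
    # System values win on key collisions, so user-set keys of the same name are lost.
    merged["_meta"] = {**body.get("_meta", {}), **system_values}
    return merged


# Illustrative input: the user keeps their own tracking data in _meta.
user_body = {"processors": [], "_meta": {"modified_by": "deploy-bot", "ticket": "OPS-1"}}
result = merge_into_meta(user_body, {"modified_by": "user_xyz"})
print(result["_meta"])
```

The unrelated `ticket` key survives, but the user's own `modified_by` value is replaced.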

Permissions

This approach might leak usernames to users who normally wouldn't have access to them. This could be mitigated by returning modified_by only if the requesting user has the permissions required to read user data in the first place.
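
One way this could look, as a sketch only: the boolean `caller_can_read_users` is a hypothetical stand-in for whatever privilege check would actually be used, and `redact_for_caller` is not an existing function.

```python
def redact_for_caller(obj: dict, caller_can_read_users: bool) -> dict:
    """Hypothetical: hide modified_by from callers who may not read user data."""
    if caller_can_read_users:
        return dict(obj)
    # Timestamps stay visible; only the username is withheld.
    return {k: v for k, v in obj.items() if k != "modified_by"}
```

With this shape, created_at and modified_at remain useful for troubleshooting even when the username is hidden.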

@flash1293 added the >enhancement, Team:Data Management and needs:triage labels on May 17, 2024
@elasticsearchmachine removed the Team:Data Management label on May 17, 2024
@ruflin
Member

ruflin commented May 17, 2024

Eventually we should also find ways to query this data, for example: "Give me the ingest pipelines that changed in the last 24h."
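
Until such a query exists, a client could filter the GET _ingest/pipeline response itself. A rough sketch, assuming the proposed modified_at field is present in the format shown in the issue description and that all timestamps share one time zone (`changed_since` is a hypothetical helper):

```python
from datetime import datetime, timedelta


def changed_since(pipelines: dict, now: datetime, window: timedelta) -> list:
    """Return the names of pipelines whose modified_at falls within `window` of `now`."""
    cutoff = now - window
    return sorted(
        name for name, body in pipelines.items()
        # Pipelines without the (proposed) field are skipped rather than guessed at.
        if "modified_at" in body
        and datetime.fromisoformat(body["modified_at"]) >= cutoff
    )
```

Here `pipelines` is the parsed JSON body keyed by pipeline name, and `window` would be `timedelta(hours=24)` for the example above.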

@flash1293
Contributor Author

@ruflin that's a good thought, seems like a follow-up issue to me.

@pgomulka added the :Data Management/ILM+SLM label on May 17, 2024
@elasticsearchmachine added the Team:Data Management label on May 17, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine removed the needs:triage label on May 17, 2024
@dakrone
Member

dakrone commented May 17, 2024

While I wish we could have a standardization here, we do have a field like this for ILM policies already: modified_date. If we want to add these, it would be good to standardize on exactly what names we want.

Is there a reason you prepended an underscore to the names? I don't think we would necessarily have to do that.

@ruflin
Member

ruflin commented May 21, 2024

++ on standardisation.

The underscore indicates that this is something created by the system that can't be modified by the user. It likely also prevents conflicts with user-defined fields, but I agree, we might not need it.

@flash1293
Contributor Author

I don't feel strongly about the _; it was mostly just a starting point. I edited the issue to align with existing names.
