Model Details section
Identifies and contextualizes the model for users and auditors.
Basic information about the model: who developed it, version, model type, training algorithm, parameters, and links to further resources.
A standardized, concise document accompanying a trained ML model that reports benchmark results disaggregated across demographic, environmental, and technical groups, enabling bias detection and helping prevent model misuse.
Intended Use section
Preventing model misuse by clearly defining its intended purpose and boundaries.
Intended use cases, target users, and explicitly out-of-scope applications.
Factors section
Identifies the dimensions along which evaluation results should be disaggregated.
Relevant factors affecting model performance: demographic groups, environmental conditions, and instrumentation attributes. Distinguishes relevant factors from evaluation factors.
Metrics section
Defining the metrics used to evaluate model quality.
Performance metrics and decision thresholds, with justification of their selection for the model type and use case.
Evaluation Data section
Ensuring transparency of the data used to measure model performance.
Description of evaluation datasets: source, composition, and representativeness for intended use cases.
Training Data section
Documenting the data that shapes model behavior.
General description of training data provenance and statistical composition, accounting for privacy constraints.
Quantitative Analyses section
Disclosing model bias and performance variation across demographic subgroups.
Disaggregated evaluation results per factor, covering both unitary and intersectional analyses.
Ethical Considerations section
Ethical transparency and responsible AI.
Identification of sensitive data used, implications for human life, risks and harms, and mitigation measures.
Caveats and Recommendations section
Supplementing documentation with custom, model-specific information.
Additional concerns and recommendations not covered in previous sections: limitations, known gaps, and user guidance.
Metadata
Machine-parseable metadata enabling filtering, discovery, and integration with external systems.
Optional YAML front-matter at the top of the README.md file (Hugging Face Hub implementation): license, language, pipeline_tag, base_model, datasets, eval_results.
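The machine-parseable metadata above can be sketched as follows. This is a minimal stdlib-only illustration, assuming a flat `key: value` front-matter block; a real implementation would use a proper YAML parser (e.g. PyYAML), and the example keys and values are hypothetical.

```python
# Minimal sketch: extract the leading "---"-delimited YAML front-matter
# from a model card README and parse flat "key: value" pairs.
# Stdlib only; handles just the simple flat case shown below.

def parse_front_matter(readme_text: str) -> dict:
    """Return the key/value pairs of the leading '---' delimited block."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front-matter present
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

# Hypothetical model card with front-matter metadata.
README = """---
license: apache-2.0
language: en
pipeline_tag: text-classification
base_model: bert-base-uncased
---

# Model Card
"""

print(parse_front_matter(README))
```

Because the block is plain text at a fixed position, external systems can index it without rendering the rest of the card.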
Factor Granularity
Number and specificity of demographic, environmental, and instrumental factors included in evaluation.
Intersectional Analysis
Whether reported results cover only unitary (per-factor) or also intersectional (combined-factor) analysis.
Evaluation Data Scope
Scope of disclosed information about evaluation datasets.
Training Data Transparency
Scope of disclosed information about training data, often constrained by privacy or IP restrictions.
Model card authors often report results only per single factor, omitting intersectional analysis (e.g., older women vs. young men) that reveals biases invisible in unitary analysis.
Per the source paper's recommendations (section 4.7), report both disaggregated and intersectional results for all combinations of identified factors.
The ethical considerations section is often filled with generic statements rather than a specific identification of risks, sensitive data, and mitigation measures.
Document specific risk scenarios and the mitigation measures taken; identify any sensitive data used and the rationale for its inclusion.
Model cards often report only aggregate metrics without breakdown by demographic subgroups, masking differential model performance across populations.
Report disaggregated results for each identified factor, with confidence intervals.
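Attaching a confidence interval to each subgroup result can be sketched as below. This is a normal-approximation interval and all subgroup counts are hypothetical; for small subgroups a Wilson or bootstrap interval would be safer.

```python
import math

def accuracy_with_ci(correct, total, z=1.96):
    """Accuracy with a normal-approximation 95% confidence interval.
    Sketch only: prefer Wilson/bootstrap intervals for small subgroups."""
    p = correct / total
    half = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half), min(1.0, p + half)

# Hypothetical per-subgroup (correct, total) counts.
subgroups = {"young": (450, 500), "older": (80, 120)}
for name, (correct, total) in subgroups.items():
    acc, low, high = accuracy_with_ci(correct, total)
    print(f"{name}: {acc:.3f} [{low:.3f}, {high:.3f}] (n={total})")
```

Reporting the interval alongside each point estimate makes it visible when a small subgroup's result is too noisy to support a strong claim.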
The model card is created once at initial deployment and not updated for subsequent versions or fine-tunes, leading to discrepancies between the documentation and the actual model.
Treat the model card as a living artifact: update it with every new version, fine-tuning run, or significant change in intended use.
Markdown / YAML · Hugging Face
Python · Hugging Face
Python · Google
Markdown / YAML · Hugging Face
GENESIS · Source paper
Model Cards for Model Reporting
Publication of the 'Model Cards for Model Reporting' preprint (Mitchell et al.) on arXiv
breakthrough
Mitchell et al. published the Model Cards for Model Reporting preprint on arXiv on October 5, 2018, proposing standardized model documentation with disaggregated evaluation across demographic groups.
Published at ACM FAT* 2019
breakthrough
Paper published in the proceedings of ACM FAT* 2019, formally introducing Model Cards into research and industry discourse.
Google releases open-source Model Card Toolkit
Google released the open-source Model Card Toolkit, a Python library automating model card generation from TensorFlow Model Analysis evaluation data.
Hugging Face publishes Model Card Guidebook (Ozoani, Gerchick, Mitchell)
Hugging Face published the Model Card Guidebook with an updated template integrated into the huggingface_hub library, a Model Card Creator Tool, and a landscape analysis of ML documentation.
Model Cards are a documentation pattern: text files (Markdown/YAML) independent of hardware. They can describe models running on any hardware.
The Model Details section of a card may optionally include information about the hardware used for training and serving the model (e.g., GPU requirements, training time, carbon footprint).