MLflow 3.5.1

· 2 min read
MLflow maintainers

MLflow 3.5.1 is a patch release that includes several bug fixes and improvements.

Features:

  • [CLI] Add CLI command to list registered scorers by experiment (#18255, @alkispoly-db)
  • [Deployments] Add configuration option for long-running deployments client requests (#18363, @BenWilson2)
  • [Deployments] Create set_databricks_monitoring_sql_warehouse_id API (#18346, @dbrx-euirim)
  • [Prompts] Show instructions for prompt optimization on prompt registry (#18375, @TomeHirata)

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.5.0

· 5 min read
MLflow maintainers

MLflow 3.5.0 includes several major features and improvements!

Major Features

  • ⚙️ Job Execution Backend: Introduced a new job execution backend infrastructure for running asynchronous tasks with individual execution pools, job search capabilities, and transient error handling. (#17676, #18012, #18070, #18071, #18112, #18049, @WeichenXu123)
  • 🎯 Flexible Prompt Optimization API: Introduced a new flexible API for prompt optimization with support for model switching and the GEPA algorithm, enabling more efficient prompt tuning with fewer rollouts. See the documentation to get started. (#18183, #18031, @TomeHirata)
  • 🎨 Enhanced UI Onboarding: Improved in-product onboarding experience with trace quickstart drawer and updated homepage guidance to help users discover MLflow's latest features. (#18098, #18187, @B-Step62)
  • 🔐 Security Middleware for Tracking Server: Added a security middleware layer to protect against DNS rebinding, CORS attacks, and other security threats. Read the documentation for configuration details. (#17910, @BenWilson2)

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.4.0

· 4 min read
MLflow maintainers

MLflow 3.4.0 includes several major features and improvements.

Major New Features

  • 📊 OpenTelemetry Metrics Export: MLflow now exports span-level statistics as OpenTelemetry metrics, providing enhanced observability and monitoring capabilities for traced applications. (#17325, @dbczumar)
  • 🤖 MCP Server Integration: Introducing the Model Context Protocol (MCP) server for MLflow, enabling AI assistants and LLMs to interact with MLflow programmatically. (#17122, @harupy)
  • 🧑‍⚖️ Custom Judges API: New make_judge API enables creation of custom evaluation judges for assessing LLM outputs with domain-specific criteria; a minimal sketch follows this list. (#17647, @BenWilson2, @dbczumar, @alkispoly-db, @smoorjani)
  • 📈 Correlations Backend: Implemented backend infrastructure for storing and computing correlations between experiment metrics using NPMI (Normalized Pointwise Mutual Information). (#17309, #17368, @BenWilson2)
  • 🗂️ Evaluation Datasets: MLflow now supports storing and versioning evaluation datasets directly within experiments for reproducible model assessment. (#17447, @BenWilson2)
  • 🔗 Databricks Backend for MLflow Server: MLflow server can now use Databricks as a backend, enabling seamless integration with Databricks workspaces. (#17411, @nsthorat)
  • 🤖 Claude Autologging: Automatic tracing support for Claude AI interactions, capturing conversations and model responses. (#17305, @smoorjani)
  • 🌊 Strands Agent Tracing: Added comprehensive tracing support for Strands agents, including automatic instrumentation for agent workflows and interactions. (#17151, @joelrobin18)
  • 🧪 Experiment Types in UI: MLflow now introduces experiment types, helping reduce clutter between classic ML/DL and GenAI features. MLflow auto-detects the type, but you can easily adjust it via a selector next to the experiment name. (#17605, @daniellok-db)
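
The custom judges API is the quickest of these to try out. Below is a minimal sketch, assuming make_judge is importable from mlflow.genai.judges and accepts a judge name, natural-language instructions with template fields, and a judge model URI; confirm the exact signature in the 3.4.0 documentation.

import mlflow
from mlflow.genai.judges import make_judge

# Minimal custom-judge sketch; parameter names are assumptions -- verify in the docs.
# The instructions use template fields that are filled in at evaluation time.
politeness_judge = make_judge(
    name="politeness",
    instructions=(
        "Rate whether the response in {{ outputs }} answers the question in "
        "{{ inputs }} politely and professionally. Answer 'yes' or 'no'."
    ),
    model="openai:/gpt-4o-mini",  # any supported judge model URI
)

# A judge behaves like a scorer: it can be called directly on a single example
# or passed to mlflow.genai.evaluate() alongside built-in scorers.
feedback = politeness_judge(
    inputs={"question": "Where is my order?"},
    outputs="Your order shipped yesterday and should arrive tomorrow.",
)
print(feedback.value)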

Features:

  • [Evaluation] Add ability to pass tags via dataframe in mlflow.genai.evaluate (#17549, @smoorjani)
  • [Evaluation] Add custom judge model support for Safety and RetrievalRelevance builtin scorers (#17526, @dbrx-euirim)
  • [Tracing] Add AI commands as MCP prompts for LLM interaction (#17608, @nsthorat)
  • [Tracing] Add MLFLOW_ENABLE_OTLP_EXPORTER environment variable (#17505, @dbczumar)
  • [Tracing] Support OTel and MLflow dual export; see the configuration sketch after this list (#17187, @dbczumar)
  • [Tracing] Make set_destination use ContextVar for thread safety (#17219, @B-Step62)
  • [CLI] Add MLflow commands CLI for exposing prompt commands to LLMs (#17530, @nsthorat)
  • [CLI] Add 'mlflow runs link-traces' command (#17444, @nsthorat)
  • [CLI] Add 'mlflow runs create' command for programmatic run creation (#17417, @nsthorat)
  • [CLI] Add MLflow traces CLI command with comprehensive search and management capabilities (#17302, @nsthorat)
  • [CLI] Add --env-file flag to all MLflow CLI commands (#17509, @nsthorat)
  • [Tracking] Backend for storing scorers in MLflow experiments (#17090, @WeichenXu123)
  • [Model Registry] Allow cross-workspace copying of model versions between the Workspace Model Registry (WMR) and Unity Catalog (UC) (#17458, @arpitjasa-db)
  • [Models] Add automatic Git-based model versioning for GenAI applications (#17076, @harupy)
  • [Models] Improve WheeledModel._download_wheels safety (#17004, @serena-ruan)
  • [Projects] Support resume run for Optuna hyperparameter optimization (#17191, @lu-wang-dl)
  • [Scoring] Add MLFLOW_DEPLOYMENT_CLIENT_HTTP_REQUEST_TIMEOUT environment variable (#17252, @dbczumar)
  • [UI] Add ability to hide/unhide all finished runs in Chart view (#17143, @joelrobin18)
  • [Telemetry] Add MLflow OSS telemetry for invoke_custom_judge_model (#17585, @dbrx-euirim)
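
The OpenTelemetry tracing items above are configured entirely through environment variables. A minimal sketch follows, using the standard OpenTelemetry OTLP endpoint variable together with the MLFLOW_ENABLE_OTLP_EXPORTER flag named in these notes; treat the accepted values as assumptions and confirm them in the tracing documentation.

import os

# Point the OTLP exporter at a local collector (standard OpenTelemetry variable).
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4317"
# Opt in to the OTLP exporter; with dual export, spans also reach the MLflow backend.
os.environ["MLFLOW_ENABLE_OTLP_EXPORTER"] = "true"

import mlflow

mlflow.set_experiment("otel-dual-export-demo")

@mlflow.trace
def add(a: int, b: int) -> int:
    return a + b

# The resulting span should show up in both the MLflow UI and the OTLP collector.
add(1, 2)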

Bug fixes:

  • [Evaluation] Implement DSPy LM interface for default Databricks model serving (#17672, @smoorjani)
  • [Evaluation] Fix aggregations incorrectly applied to legacy scorer interface (#17596, @BenWilson2)
  • [Evaluation] Add Unity Catalog table source support for mlflow.evaluate (#17546, @BenWilson2)
  • [Evaluation] Fix custom prompt judge encoding issues with custom judge models (#17584, @dbrx-euirim)
  • [Tracking] Fix OpenAI autolog to properly reconstruct Response objects from streaming events (#17535, @WeichenXu123)
  • [Tracking] Add basic authentication support in TypeScript SDK (#17436, @kevin-lyn)
  • [Tracking] Update scorer endpoints to v3.0 API specification (#17409, @WeichenXu123)
  • [Tracking] Fix scorer status handling in MLflow tracking backend (#17379, @WeichenXu123)
  • [Tracking] Fix missing source-run information in UI (#16682, @WeichenXu123)
  • [Scoring] Fix spark_udf to always use stdin_serve for model serving (#17580, @WeichenXu123)
  • [Scoring] Fix a bug with Spark UDF usage of uv as an environment manager (#17489, @WeichenXu123)
  • [Model Registry] Extract source workspace ID from run_link during model version migration (#17600, @arpitjasa-db)
  • [Models] Improve security by reducing write permissions in temporary directory creation (#17544, @BenWilson2)
  • [Server-infra] Fix --env-file flag compatibility with --dev mode (#17615, @nsthorat)
  • [Server-infra] Fix basic authentication with Uvicorn server (#17523, @kevin-lyn)
  • [UI] Fix experiment comparison functionality in UI (#17550, @Flametaa)
  • [UI] Fix compareExperimentsSearch route definitions (#17459, @WeichenXu123)

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.4.0rc0

· 4 min read
MLflow maintainers

MLflow 3.4.0rc0 is a release candidate for 3.4.0. To install, run the following command:

pip install mlflow==3.4.0rc0

MLflow 3.4.0rc0 includes several major features and improvements.

Major New Features

  • 📊 OpenTelemetry Metrics Export: MLflow now exports span-level statistics as OpenTelemetry metrics, providing enhanced observability and monitoring capabilities for traced applications. (#17325, @dbczumar)
  • 🤖 MCP Server Integration: Introducing the Model Context Protocol (MCP) server for MLflow, enabling AI assistants and LLMs to interact with MLflow programmatically. (#17122, @harupy)
  • 🧑‍⚖️ Custom Judges API: New make_judge API enables creation of custom evaluation judges for assessing LLM outputs with domain-specific criteria. (#17647, @BenWilson2, @dbczumar, @alkispoly-db, @smoorjani)
  • 📈 Correlations Backend: Implemented backend infrastructure for storing and computing correlations between experiment metrics using NPMI (Normalized Pointwise Mutual Information). (#17309, #17368, @BenWilson2)
  • 🗂️ Evaluation Datasets: MLflow now supports storing and versioning evaluation datasets directly within experiments for reproducible model assessment. (#17447, @BenWilson2)
  • 🔗 Databricks Backend for MLflow Server: MLflow server can now use Databricks as a backend, enabling seamless integration with Databricks workspaces. (#17411, @nsthorat)
  • 🤖 Claude Autologging: Automatic tracing support for Claude AI interactions, capturing conversations and model responses. (#17305, @smoorjani)
  • 🌊 Strands Agent Tracing: Added comprehensive tracing support for Strands agents, including automatic instrumentation for agent workflows and interactions. (#17151, @joelrobin18)

Features:

  • [Evaluation] Add ability to pass tags via dataframe in mlflow.genai.evaluate (#17549, @smoorjani)
  • [Evaluation] Add custom judge model support for Safety and RetrievalRelevance builtin scorers (#17526, @dbrx-euirim)
  • [Tracing] Add AI commands as MCP prompts for LLM interaction (#17608, @nsthorat)
  • [Tracing] Add MLFLOW_ENABLE_OTLP_EXPORTER environment variable (#17505, @dbczumar)
  • [Tracing] Support OTel and MLflow dual export (#17187, @dbczumar)
  • [Tracing] Make set_destination use ContextVar for thread safety (#17219, @B-Step62)
  • [CLI] Add MLflow commands CLI for exposing prompt commands to LLMs (#17530, @nsthorat)
  • [CLI] Add 'mlflow runs link-traces' command (#17444, @nsthorat)
  • [CLI] Add 'mlflow runs create' command for programmatic run creation (#17417, @nsthorat)
  • [CLI] Add MLflow traces CLI command with comprehensive search and management capabilities (#17302, @nsthorat)
  • [CLI] Add --env-file flag to all MLflow CLI commands (#17509, @nsthorat)
  • [Tracking] Backend for storing scorers in MLflow experiments (#17090, @WeichenXu123)
  • [Model Registry] Allow cross-workspace copying of model versions between the Workspace Model Registry (WMR) and Unity Catalog (UC) (#17458, @arpitjasa-db)
  • [Models] Add automatic Git-based model versioning for GenAI applications (#17076, @harupy)
  • [Models] Improve WheeledModel._download_wheels safety (#17004, @serena-ruan)
  • [Projects] Support resume run for Optuna hyperparameter optimization (#17191, @lu-wang-dl)
  • [Scoring] Add MLFLOW_DEPLOYMENT_CLIENT_HTTP_REQUEST_TIMEOUT environment variable (#17252, @dbczumar)
  • [UI] Add ability to hide/unhide all finished runs in Chart view (#17143, @joelrobin18)
  • [Telemetry] Add MLflow OSS telemetry for invoke_custom_judge_model (#17585, @dbrx-euirim)

Bug fixes:

  • [Evaluation] Implement DSPy LM interface for default Databricks model serving (#17672, @smoorjani)
  • [Evaluation] Fix aggregations incorrectly applied to legacy scorer interface (#17596, @BenWilson2)
  • [Evaluation] Add Unity Catalog table source support for mlflow.evaluate (#17546, @BenWilson2)
  • [Evaluation] Fix custom prompt judge encoding issues with custom judge models (#17584, @dbrx-euirim)
  • [Tracking] Fix OpenAI autolog to properly reconstruct Response objects from streaming events (#17535, @WeichenXu123)
  • [Tracking] Add basic authentication support in TypeScript SDK (#17436, @kevin-lyn)
  • [Tracking] Update scorer endpoints to v3.0 API specification (#17409, @WeichenXu123)
  • [Tracking] Fix scorer status handling in MLflow tracking backend (#17379, @WeichenXu123)
  • [Tracking] Fix missing source-run information in UI (#16682, @WeichenXu123)
  • [Scoring] Fix spark_udf to always use stdin_serve for model serving (#17580, @WeichenXu123)
  • [Scoring] Fix a bug with Spark UDF usage of uv as an environment manager (#17489, @WeichenXu123)
  • [Model Registry] Extract source workspace ID from run_link during model version migration (#17600, @arpitjasa-db)
  • [Models] Improve security by reducing write permissions in temporary directory creation (#17544, @BenWilson2)
  • [Server-infra] Fix --env-file flag compatibility with --dev mode (#17615, @nsthorat)
  • [Server-infra] Fix basic authentication with Uvicorn server (#17523, @kevin-lyn)
  • [UI] Fix experiment comparison functionality in UI (#17550, @Flametaa)
  • [UI] Fix compareExperimentsSearch route definitions (#17459, @WeichenXu123)

Please try it out and report any issues on the issue tracker.

MLflow 3.3.2

· One min read
MLflow maintainers

MLflow 3.3.2 is a patch release that includes several minor improvements and bug fixes.

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.3.1

· One min read
MLflow maintainers

MLflow 3.3.1 is a patch release with small bug fixes and documentation updates.

Small bug fixes and documentation updates:

#17295, @gunsodo; #17272, @bbqiu

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.3.0

· 2 min read
MLflow maintainers

MLflow 3.3.0 includes several major features and improvements.

[Screenshot: the new Evaluation UI]

Major new features:

  • 🪝 Model Registry Webhooks: MLflow now supports webhooks for model registry events, enabling automated notifications and integrations with external systems. (#16583, @harupy)
  • 🧭 Agno Tracing Integration: Added Agno tracing integration for enhanced observability of AI agent workflows. (#16995, @joelrobin18)
  • 🧪 GenAI Evaluation in OSS: MLflow open-sources the new evaluation capability for LLM applications. This suite enables systematic measurement and improvement of LLM application quality, with tight integration into MLflow's observability, feedback collection, and experiment tracking capabilities; a minimal usage sketch follows this list. (#17161, #17159, @B-Step62)
  • 🖥️ Revamped Trace Table View: The new trace view in the MLflow UI provides a streamlined interface for exploring, filtering, and monitoring traces, with enhanced search capabilities including full-text search across requests. (#17092, @daniellok-db)
  • ⚡️ FastAPI + Uvicorn Server: MLflow Tracking Server now defaults to FastAPI + Uvicorn for improved performance, while maintaining Flask compatibility. (#17038, @dbczumar)
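
A minimal usage sketch of the open-sourced evaluation suite, assuming the mlflow.genai.evaluate entry point and the built-in scorers under mlflow.genai.scorers; the scorer names and parameters here are assumptions, so refer to the GenAI evaluation documentation for the authoritative API.

import mlflow
from mlflow.genai.scorers import RelevanceToQuery, Safety  # built-in judges (names assumed)

# Each record holds the inputs that will be passed to the application under test.
eval_data = [
    {"inputs": {"question": "What is MLflow Tracking?"}},
    {"inputs": {"question": "How do I register a model version?"}},
]

def my_app(question: str) -> str:
    # Stand-in for the real LLM application being evaluated.
    return f"Here is an answer about: {question}"

results = mlflow.genai.evaluate(
    data=eval_data,
    predict_fn=my_app,
    scorers=[RelevanceToQuery(), Safety()],
)
print(results.metrics)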

New features:

  • [Tracking] Add a Docker compose file to quickly start a local MLflow server with recommended minimum setup (#17065, @joelrobin18)
  • [Tracing] Add memory span type for agentic workflows (#17034, @B-Step62)
  • [Prompts] Enable custom prompt optimizers in optimize_prompt including DSPy support (#17052, @TomeHirata)
  • [Model Registry / Prompts] Proper support for the @latest alias; see the sketch after this list (#17146, @B-Step62)
  • [Metrics] Allow custom tokenizer encoding in token_count function (#16253, @joelrobin18)
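
For the @latest alias item above, here is a minimal sketch using the prompt registry; the register_prompt/load_prompt helpers and the prompts:/ URI scheme are assumptions to verify against the prompt registry documentation.

import mlflow

# Register a prompt version (the first call creates version 1).
mlflow.genai.register_prompt(
    name="summarize",
    template="Summarize the following text in one sentence:\n\n{{ text }}",
)

# Resolve whichever version is newest via the @latest alias.
prompt = mlflow.genai.load_prompt("prompts:/summarize@latest")
print(prompt.format(text="MLflow 3.3.0 adds GenAI evaluation, webhooks, and a new trace table view."))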

Bug fixes:

  • [Tracking] Fix Databricks secret scope check to reduce audit log errors (#17166, @harupy)
  • [Tracking] Fix Databricks SDK error code mapping in retry logic (#17095, @harupy)
  • [Tracing] Remove API keys from CrewAI traces to prevent credential leakage (#17082, @diy2learn)
  • [Tracing] Fix LiteLLM span association issue by making callbacks synchronous (#16982, @B-Step62)
  • [Tracing] Fix OpenAI Agents tracing (#17227, @B-Step62)
  • [Evaluation] Fix 'get_label_schema has no attribute' error (#17163, @smoorjani)
  • [Docs] Fix version selector on API Reference page by adding missing CSS class and versions.json generation (#17247, @copilot-swe-agent)

Documentation updates:

  • [Docs] Document custom optimizer usage with optimize_prompt (#17084, @TomeHirata)
  • [Docs] Fix built-in scorer documentation for expectation parameter (#17075, @smoorjani)
  • [Docs] Add comprehensive documentation for scorers (#17258, @B-Step62)

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.2.0

· 5 min read
MLflow maintainers

MLflow 3.2.0 includes several major features and improvements.

Major New Features

📊 Usage Tracking (New in 3.2.0)

  • Starting with version 3.2.0, MLflow will begin collecting anonymized usage data about how core features of the platform are used. This data contains no sensitive or personally identifiable information, and users can opt out of data collection at any time. Check MLflow documentation for more details. (#16439, @serena-ruan)
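
A minimal sketch of opting out programmatically; the MLFLOW_DISABLE_TELEMETRY variable name below is an assumption, so check the usage-tracking documentation for the supported opt-out mechanisms.

import os

# Disable anonymized usage data collection before importing mlflow
# (variable name assumed -- confirm in the telemetry documentation).
os.environ["MLFLOW_DISABLE_TELEMETRY"] = "true"

import mlflow  # telemetry is disabled for this process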

Bug fixes:

  • [Tracking / UI] Add missing default headers and replace absolute URLs in new browser client requests (GraphQL & logged models) (#16840, @danilopeixoto)
  • [Tracking] Fix tracking_uri positional argument bug in artifact repositories (#16878, @copilot-swe-agent)
  • [Models] Fix UnionType support for Python 3.10 style union syntax (#16882, @harupy)
  • [Tracing / Tracking] Fix OpenAI autolog Pydantic validation for enum values (#16862, @mohammadsubhani)
  • [Tracking] Fix tracing for Anthropic and Langchain combination (#15151, @maver1ck)
  • [Models] Fix OpenAI multimodal message logging support (#16795, @mohammadsubhani)
  • [Tracing] Avoid using nested threading for Azure Databricks trace export (#16733, @TomeHirata)
  • [Evaluation] Fix Databricks GenAI evaluation dataset source returning a string instead of a DatasetSource instance (#16712, @dbczumar)
  • [Models] Fix get_model_info to provide logged model info (#16713, @harupy)
  • [Evaluation] Fix serialization and deserialization for python scorers (#16688, @connorchenn)
  • [UI] Fix GraphQL handler erroring on NaN metric values (#16628, @daniellok-db)
  • [UI] Add back video artifact preview (#16620, @daniellok-db)
  • [Tracing] Proper chat message reconstruction from OAI streaming response (#16519, @B-Step62)
  • [Tracing] Convert trace column in search_traces() response to JSON string (#16523, @B-Step62)
  • [Evaluation] Fix mlflow.evaluate crashes in _get_binary_classifier_metrics due to … (#16485, @mohammadsubhani)
  • [Evaluation] Fix trace detection logic for mlflow.genai.evaluate (#16932, @B-Step62)
  • [Evaluation] Enable use of make_genai_metric_from_prompt with mlflow.evaluate (#16960, @TomeHirata)
  • [Models] Add explicit encoding for decoding streaming Responses (#16855, @aravind-segu)
  • [Tracking] Prevent DSPy model API keys from being captured in traces (#17021, @czyzby)
  • [Tracking] Fix pytorch datetime issue (#17030, @serena-ruan)
  • [Tracking] Fix predict with pre-releases (#16998, @serena-ruan)

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.1.1

· One min read
MLflow maintainers

MLflow 3.1.1 includes several features and bug fixes.

Features:

  • [Model Registry / Sqlalchemy] Increase prompt text limit from 5K to 100K (#16377, @harupy)
  • [Tracking] Support pagination in get-history of FileStore and SqlAlchemyStore (#16325, @TomeHirata)

Bug fixes:

  • [Artifacts] Support downloading logged model artifacts (#16356, @TomeHirata)
  • [Models] Fix bedrock provider, configured inference profile compatibility (#15604, @lloydhamilton)
  • [Tracking] Specify attribute.run_id when search_traces filters by run_id (#16295, @artjen)
  • [Tracking] Fix graphql batching attacks (#16227, @serena-ruan)
  • [Model Registry] Make the chunk size configurable in DatabricksSDKModelsArtifactRepository (#16247, @TomeHirata)

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.

MLflow 3.0.1

· One min read
MLflow maintainers

MLflow 3.0.1 is a patch release that includes minor features and bug fixes.

Features:

  • [Model Registry / Sqlalchemy] Increase prompt text limit from 5K to 100K (#16377, @harupy)

Bug fixes:

  • [Models] Fix bedrock provider, configured inference profile compatibility (#15604, @lloydhamilton)

For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.