Testing

langsmith.testing ¶

LangSmith pytest testing module.

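Function	Description
`log_feedback`	Log run feedback from within a pytest test run.
`log_inputs`	Log run inputs from within a pytest test run.
`log_outputs`	Log run outputs from within a pytest test run.
`log_reference_outputs`	Log example reference outputs from within a pytest test run.
`trace_feedback`	Trace the computation of pytest run feedback as its own run.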

log_feedback ¶

log_feedback(
    feedback: dict | list[dict] | None = None,
    /,
    *,
    key: str,
    score: int | bool | float | None = None,
    value: str | int | float | bool | None = None,
    **kwargs: Any,
) -> None

Log run feedback from within a pytest test run.

.. warning::

This API is in beta and might change in future versions.

Should only be used in pytest tests decorated with @pytest.mark.langsmith.

Parameter	Description
`key`	Feedback name. Type: `str`
`score`	Numerical feedback value. Type: `int \| bool \| float \| None` Default: `None`
`value`	Categorical feedback value. Type: `str \| int \| float \| bool \| None` Default: `None`
`kwargs`	Any other Client.create_feedback arguments. Type: `Any` Default: `{}`

Example

.. code-block:: python

import pytest
from langsmith import testing as t

# `foo` stands in for the function under test and is assumed to be defined elsewhere.


@pytest.mark.langsmith
def test_foo() -> None:
    x = 0
    y = 1
    expected = 2
    result = foo(x, y)
    t.log_feedback(key="right_type", score=isinstance(result, int))
    assert result == expected
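
The `score` and `value` parameters can be logged for the same run under different keys. The following is a minimal sketch, assuming `score` suits numeric or boolean checks and `value` suits categorical labels, as the parameter table describes. `log_result_feedback` is a hypothetical helper, not part of LangSmith, and `t` is expected to be the `langsmith.testing` module, passed in so the helper can be exercised without a LangSmith connection.

```python
from typing import Any


def log_result_feedback(t: Any, result: Any, expected: Any) -> None:
    """Log one numeric and one categorical feedback entry for a result.

    `t` is assumed to be the `langsmith.testing` module (or any object
    exposing a compatible `log_feedback`); this helper is illustrative only.
    """
    # Numeric or boolean feedback goes in `score`.
    t.log_feedback(key="exact_match", score=result == expected)
    # Categorical feedback goes in `value`.
    t.log_feedback(key="result_type", value=type(result).__name__)
```

Inside a test decorated with @pytest.mark.langsmith, this could be called as `log_result_feedback(t, result, expected)` after `from langsmith import testing as t`.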

log_inputs ¶

log_inputs(inputs: dict) -> None

Log run inputs from within a pytest test run.

.. warning::

This API is in beta and might change in future versions.

Should only be used in pytest tests decorated with @pytest.mark.langsmith.

Parameter	Description
`inputs`	Inputs to log. Type: `dict`

Example

.. code-block:: python

import pytest
from langsmith import testing as t


@pytest.mark.langsmith
def test_foo() -> None:
    x = 0
    y = 1
    t.log_inputs({"x": x, "y": y})
    assert foo(x, y) == 2
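
Because `log_inputs` is annotated as taking a `dict`, structured test inputs need converting before they are logged. The following is a minimal sketch, assuming the inputs live in a dataclass. `AddCase` and `to_loggable` are hypothetical names used for illustration, not part of LangSmith.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


@dataclass
class AddCase:
    """Hypothetical container for one test case's inputs."""

    x: int
    y: int


def to_loggable(obj: Any) -> dict:
    """Return a plain dict for a dataclass instance, a dict, or a bare value."""
    if is_dataclass(obj) and not isinstance(obj, type):
        # Dataclass instances become {field_name: value, ...}.
        return asdict(obj)
    if isinstance(obj, dict):
        # Copy so later mutation of the original does not affect the result.
        return dict(obj)
    # Fall back to wrapping a bare value under a generic key.
    return {"value": obj}
```

Within a decorated test, the result could then be passed as `t.log_inputs(to_loggable(case))`. The same conversion would apply to any other function on this page that takes a `dict`.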

log_outputs ¶

log_outputs(outputs: dict) -> None

Log run outputs from within a pytest test run.

.. warning::

This API is in beta and might change in future versions.

Should only be used in pytest tests decorated with @pytest.mark.langsmith.

Parameter	Description
`outputs`	Outputs to log. Type: `dict`

Example

.. code-block:: python

import pytest
from langsmith import testing as t


@pytest.mark.langsmith
def test_foo() -> None:
    x = 0
    y = 1
    result = foo(x, y)
    t.log_outputs({"foo": result})
    assert result == 2

log_reference_outputs ¶

log_reference_outputs(reference_outputs: dict) -> None

Log example reference outputs from within a pytest test run.

.. warning::

This API is in beta and might change in future versions.

Should only be used in pytest tests decorated with @pytest.mark.langsmith.

Parameter	Description
`reference_outputs`	Reference outputs to log. Type: `dict`

Example

.. code-block:: python

import pytest
from langsmith import testing


@pytest.mark.langsmith
def test_foo() -> None:
    x = 0
    y = 1
    expected = 2
    testing.log_reference_outputs({"foo": expected})
    assert foo(x, y) == expected
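
Once reference outputs are logged, a test will often compare them against the actual outputs key by key. The following is a minimal sketch of that comparison, producing one feedback dict per reference key. `compare_to_reference` is a hypothetical helper, and the `{"key": ..., "score": ...}` layout is an assumption that mirrors the keyword parameters of `log_feedback`.

```python
from typing import Any


def compare_to_reference(
    outputs: dict, reference_outputs: dict
) -> list[dict[str, Any]]:
    """Build one feedback dict per reference key, scoring exact matches."""
    feedback = []
    for name, expected in reference_outputs.items():
        # A key missing from `outputs` counts as a mismatch.
        matched = name in outputs and outputs[name] == expected
        feedback.append({"key": f"{name}_matches", "score": matched})
    return feedback
```

Since the `log_feedback` signature accepts `dict | list[dict]` as its positional argument, the returned list could presumably be passed as `testing.log_feedback(compare_to_reference(outputs, reference_outputs))`. Logging each entry with explicit `key=` and `score=` keywords is the more conservative option.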

trace_feedback ¶

trace_feedback(*, name: str = 'Feedback') -> Generator[RunTree | None, None, None]

Trace the computation of pytest run feedback as its own run.

.. warning::

This API is in beta and might change in future versions.

Parameter	Description
`name`	Feedback run name. Type: `str` Default: `'Feedback'`

Example

.. code-block:: python

import openai
import pytest

from langsmith import testing as t
from langsmith import wrappers

oai_client = wrappers.wrap_openai(openai.Client())


@pytest.mark.langsmith
def test_openai_says_hello():
    # Traced code will be included in the test case
    text = "Say hello!"
    response = oai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": text},
        ],
    )
    t.log_inputs({"text": text})
    t.log_outputs({"response": response.choices[0].message.content})
    t.log_reference_outputs({"response": "hello!"})

    # Use this context manager to trace any steps used for generating evaluation
    # feedback separately from the main application logic
    with t.trace_feedback():
        grade = oai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {
                    "role": "system",
                    "content": "Return 1 if 'hello' is in the user message and 0 otherwise.",
                },
                {
                    "role": "user",
                    "content": response.choices[0].message.content,
                },
            ],
        )
        # Make sure to log relevant feedback within the context for the
        # trace to be associated with this feedback.
        t.log_feedback(
            key="llm_judge", score=float(grade.choices[0].message.content)
        )

    assert "hello" in response.choices[0].message.content.lower()
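
The example above converts the judge's reply with `float(...)`, which raises `ValueError` if the reply contains anything besides a number. The following is a minimal sketch of a more tolerant variant that also sets the `name` parameter of `trace_feedback`. `parse_judge_score` and `log_judge_feedback` are hypothetical helpers, the run name "llm_judge_parse" is arbitrary, and `t` is assumed to be the `langsmith.testing` module passed in by the caller.

```python
import re
from typing import Any, Optional


def parse_judge_score(reply: Optional[str]) -> Optional[float]:
    """Extract the first number from a judge reply, or None if there is none."""
    if not reply:
        return None
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    return float(match.group()) if match else None


def log_judge_feedback(t: Any, reply: Optional[str]) -> None:
    """Trace the parsing step as its own run and log the score if one was found."""
    with t.trace_feedback(name="llm_judge_parse"):
        score = parse_judge_score(reply)
        if score is not None:
            # Logged inside the context so the trace is associated with it.
            t.log_feedback(key="llm_judge", score=score)
```

When no number can be extracted, this sketch logs nothing instead of failing the test. Whether a silent skip or a hard failure is preferable depends on how much the judge's output format is trusted.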